The Linux kernel is the heart of many modern production systems. It decides when any code is allowed to run and which programs/users can access which resources. It manages memory, mediates access to hardware, and does the bulk of the work under the hood on behalf of programs running on top. Since the kernel is always involved in any code execution, it is in the best position to protect the system from malicious programs, enforce the desired system security policy, and provide security features for safer production environments.
In this post, we will review some Linux kernel security configurations we use at Cloudflare and how they help to block or minimize a potential system compromise.
When a machine (either a laptop or a server) boots, it goes through several boot stages:
Within a secure boot architecture each stage from the above diagram verifies the integrity of the next stage before passing execution to it, thus forming a so-called secure boot chain. This way “trustworthiness” is extended to every component in the boot chain, because if we verified the code integrity of a particular stage, we can trust this code to verify the integrity of the next stage.
We have previously covered how Cloudflare implements secure boot in the initial stages of the boot process. In this post, we will focus on the Linux kernel.
Secure boot is the cornerstone of any operating system security mechanism. The Linux kernel is the primary enforcer of the operating system security configuration and policy, so we have to be sure that the Linux kernel itself has not been tampered with. In our previous post about secure boot we showed how we use UEFI Secure Boot to ensure the integrity of the Linux kernel.
But what happens next? After the kernel gets executed, it may try to load additional drivers, or as they are called in the Linux world, kernel modules. And kernel module loading is not confined just to the boot process. A module can be loaded at any time during runtime: when a new device is plugged in and a driver is needed, when additional extensions in the networking stack are required (for example, for fine-grained firewall rules), or simply manually by the system administrator.
However, uncontrolled kernel module loading might pose a significant risk to system integrity. Unlike regular programs, which get executed as user space processes, kernel modules are pieces of code which get injected and executed directly in the Linux kernel address space. There is no separation between the code and data in different kernel modules and core kernel subsystems, so everything can access everything. This means that a rogue kernel module can completely nullify the trustworthiness of the operating system and make secure boot useless. As an example, consider a simple Debian 12 (Bookworm) installation, but with SELinux configured and in enforcing mode:
ignat@dev:~$ lsb_release --all
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux 12 (bookworm)
Release: 12
Codename: bookworm
ignat@dev:~$ uname -a
Linux dev 6.1.0-18-cloud-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.76-1 (2024-02-01) x86_64 GNU/Linux
ignat@dev:~$ sudo getenforce
Enforcing
Now we need to do some research. First, we see that we’re running Linux kernel 6.1.76. If we explore the source code, we will see that inside the kernel the SELinux configuration is stored in a singleton structure, which is defined as follows:
struct selinux_state {
#ifdef CONFIG_SECURITY_SELINUX_DISABLE
    bool disabled;
#endif
#ifdef CONFIG_SECURITY_SELINUX_DEVELOP
    bool enforcing;
#endif
    bool checkreqprot;
    bool initialized;
    bool policycap[__POLICYDB_CAP_MAX];
    struct page *status_page;
    struct mutex status_lock;
    struct selinux_avc *avc;
    struct selinux_policy __rcu *policy;
    struct mutex policy_mutex;
} __randomize_layout;
From the above, we can see that if the kernel configuration has CONFIG_SECURITY_SELINUX_DEVELOP enabled, the structure would have a boolean variable enforcing, which controls the enforcement status of SELinux at runtime. This is exactly what the above sudo getenforce command returns. We can double check that the Debian kernel indeed has the configuration option enabled:
ignat@dev:~$ grep CONFIG_SECURITY_SELINUX_DEVELOP /boot/config-`uname -r`
CONFIG_SECURITY_SELINUX_DEVELOP=y
Good! Now that we have a variable in the kernel which is responsible for some security enforcement, we can try to attack it. One problem though is the __randomize_layout attribute: since CONFIG_SECURITY_SELINUX_DISABLE is actually not set for our Debian kernel, normally enforcing would be the first member of the struct. Thus, if we know where the struct is, we immediately know the position of the enforcing flag. With __randomize_layout, during kernel compilation the compiler might place members at arbitrary positions within the struct, so it is harder to create generic exploits. But arbitrary struct randomization within the kernel may introduce a performance impact, so it is often disabled, and it is disabled for the Debian kernel:
ignat@dev:~$ grep RANDSTRUCT /boot/config-`uname -r`
CONFIG_RANDSTRUCT_NONE=y
We can also confirm the compiled position of the enforcing flag using the pahole tool and either kernel debug symbols, if available, or (on modern kernels, if enabled) in-kernel BTF information. We will use the latter:
ignat@dev:~$ pahole -C selinux_state /sys/kernel/btf/vmlinux
struct selinux_state {
bool enforcing; /* 0 1 */
bool checkreqprot; /* 1 1 */
bool initialized; /* 2 1 */
bool policycap[8]; /* 3 8 */
/* XXX 5 bytes hole, try to pack */
struct page * status_page; /* 16 8 */
struct mutex status_lock; /* 24 32 */
struct selinux_avc * avc; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
struct selinux_policy * policy; /* 64 8 */
struct mutex policy_mutex; /* 72 32 */
/* size: 104, cachelines: 2, members: 9 */
/* sum members: 99, holes: 1, sum holes: 5 */
/* last cacheline: 40 bytes */
};
So enforcing is indeed located at the start of the structure, and we don’t even have to be a privileged user to confirm this.
Great! All we need is the runtime address of the selinux_state variable inside the kernel:
ignat@dev:~$ sudo grep selinux_state /proc/kallsyms
ffffffffbc3bcae0 B selinux_state
With all this information, we can write an almost textbook-simple kernel module to manipulate the SELinux state:
mymod.c:
#include <linux/module.h>

static int __init mod_init(void)
{
    /* hardcoded runtime address of selinux_state, taken from /proc/kallsyms */
    bool *selinux_enforce = (bool *)0xffffffffbc3bcae0;

    /* enforcing is the first member of the struct, so just flip it */
    *selinux_enforce = false;
    return 0;
}

static void mod_fini(void)
{
}

module_init(mod_init);
module_exit(mod_fini);

MODULE_DESCRIPTION("A somewhat malicious module");
MODULE_AUTHOR("Ignat Korchagin <ignat@cloudflare.com>");
MODULE_LICENSE("GPL");
And the respective Kbuild file:
obj-m := mymod.o
With these two files we can build a full-fledged kernel module according to the official kernel docs:
ignat@dev:~$ cd mymod/
ignat@dev:~/mymod$ ls
Kbuild mymod.c
ignat@dev:~/mymod$ make -C /lib/modules/`uname -r`/build M=$PWD
make: Entering directory '/usr/src/linux-headers-6.1.0-18-cloud-amd64'
CC [M] /home/ignat/mymod/mymod.o
MODPOST /home/ignat/mymod/Module.symvers
CC [M] /home/ignat/mymod/mymod.mod.o
LD [M] /home/ignat/mymod/mymod.ko
BTF [M] /home/ignat/mymod/mymod.ko
Skipping BTF generation for /home/ignat/mymod/mymod.ko due to unavailability of vmlinux
make: Leaving directory '/usr/src/linux-headers-6.1.0-18-cloud-amd64'
If we try to load this module now, the system may not allow it due to the SELinux policy:
ignat@dev:~/mymod$ sudo insmod mymod.ko
insmod: ERROR: could not load module mymod.ko: Permission denied
We can work around this by copying the module somewhere into the standard module path:
ignat@dev:~/mymod$ sudo cp mymod.ko /lib/modules/`uname -r`/kernel/crypto/
Now let’s try it out:
ignat@dev:~/mymod$ sudo getenforce
Enforcing
ignat@dev:~/mymod$ sudo insmod /lib/modules/`uname -r`/kernel/crypto/mymod.ko
ignat@dev:~/mymod$ sudo getenforce
Permissive
Not only did we disable the SELinux protection via a malicious kernel module, we did it quietly. A normal sudo setenforce 0, even if allowed, would go through the official selinuxfs interface and would emit an audit message. Our code manipulated the kernel memory directly, so no one was alerted. This illustrates why uncontrolled kernel module loading is very dangerous, and it is why most security standards and commercial security monitoring products advocate for close monitoring of kernel module loading.
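As a toy illustration of what such monitoring could look like (a minimal sketch only: real-world monitoring would typically use the audit subsystem or eBPF rather than polling), a script could watch /proc/modules for newly appearing modules:

import time

def loaded_modules():
    # each line of /proc/modules starts with the module name
    with open("/proc/modules") as f:
        return {line.split()[0] for line in f}

known = loaded_modules()
while True:
    time.sleep(5)
    current = loaded_modules()
    for name in sorted(current - known):
        print(f"new kernel module loaded: {name}")
    known = current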
But we don’t need to monitor kernel modules at Cloudflare. Let’s repeat the exercise on a Cloudflare production kernel (module recompilation skipped for brevity):
ignat@dev:~/mymod$ uname -a
Linux dev 6.6.17-cloudflare-2024.2.9 #1 SMP PREEMPT_DYNAMIC Mon Sep 27 00:00:00 UTC 2010 x86_64 GNU/Linux
ignat@dev:~/mymod$ sudo insmod /lib/modules/`uname -r`/kernel/crypto/mymod.ko
insmod: ERROR: could not insert module /lib/modules/6.6.17-cloudflare-2024.2.9/kernel/crypto/mymod.ko: Key was rejected by service
We get a Key was rejected by service error when trying to load a module, and the kernel log will have the following message:
ignat@dev:~/mymod$ sudo dmesg | tail -n 1
[41515.037031] Loading of unsigned module is rejected
This is because the Cloudflare kernel requires all the kernel modules to have a valid signature, so we don’t even have to worry about a malicious module being loaded at some point:
ignat@dev:~$ grep MODULE_SIG_FORCE /boot/config-`uname -r`
CONFIG_MODULE_SIG_FORCE=y
For completeness, it is worth noting that the stock Debian kernel also supports module signatures, but does not enforce them:
ignat@dev:~$ grep MODULE_SIG /boot/config-6.1.0-18-cloud-amd64
CONFIG_MODULE_SIG_FORMAT=y
CONFIG_MODULE_SIG=y
# CONFIG_MODULE_SIG_FORCE is not set
…
The above configuration means that the kernel will validate a module signature, if one is present. But if not, the module will be loaded anyway: a warning message is emitted and the kernel is tainted.
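As a quick way to tell whether an unsigned module has already been loaded on a running system, you can inspect the kernel taint mask (a small sketch; it assumes the documented taint bit numbering, where bit 13, flag 'E', means an unsigned module was loaded):

# read the kernel taint mask and test the "unsigned module loaded" bit
TAINT_UNSIGNED_MODULE = 1 << 13  # flag 'E' in the kernel's taint-flag documentation

with open("/proc/sys/kernel/tainted") as f:
    taint_mask = int(f.read().strip())

if taint_mask & TAINT_UNSIGNED_MODULE:
    print("an unsigned kernel module has been loaded")
else:
    print("no unsigned-module taint recorded")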
Signed kernel modules are great, but they create a key management problem: to sign a module we need a signing keypair that is trusted by the kernel. The public key of the keypair is usually directly embedded into the kernel binary, so the kernel can easily use it to verify module signatures. The private key of the pair needs to be protected and kept secure, because if it is leaked, anyone could compile and sign a potentially malicious kernel module which would be accepted by our kernel.
But what is the best way to eliminate the risk of losing something? Not to have it in the first place! Luckily, the kernel build system will generate a random keypair for module signing if none is provided. At Cloudflare, we use that feature to sign all the kernel modules during the kernel compilation stage. When the compilation and signing are done though, instead of storing the key in a secure place, we just destroy the private key.
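For illustration, the relevant part of such a build configuration looks roughly like the sketch below (option names are taken from upstream kernel Kconfig and may differ slightly between versions, so treat this as an assumption rather than our exact production config); if the referenced key file does not exist at build time, the build system generates a fresh keypair and uses it to sign every module:

CONFIG_MODULE_SIG=y
CONFIG_MODULE_SIG_FORCE=y
CONFIG_MODULE_SIG_ALL=y
CONFIG_MODULE_SIG_SHA256=y
# if this file is absent, an ephemeral keypair is generated during the build
CONFIG_MODULE_SIG_KEY="certs/signing_key.pem"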
With this process, the kernel modules are signed at build time with a freshly generated keypair, the corresponding public key is embedded into the kernel image, and the private key is destroyed as soon as the build completes.
With this scheme, not only do we not have to worry about module signing key management, we also use a different key for each kernel we release to production. So even if a particular build process is hijacked and the signing key is not destroyed and potentially leaked, the key will no longer be valid when a kernel update is released.
There are some flexibility downsides though, as we can’t “retrofit” a new kernel module for an already released kernel (for example, for a new piece of hardware we are adopting). However, it is not a practical limitation for us as we release kernels often (roughly every week) to keep up with a steady stream of bug fixes and vulnerability patches in the Linux Kernel.
KEXEC (or kexec_load()) is an interesting system call in Linux, which allows one kernel to directly execute (or jump to) another kernel. The idea behind this is to switch/update/downgrade kernels faster without going through a full reboot cycle, to minimize the potential system downtime. However, it was developed quite a while ago, when secure boot and system integrity were not yet much of a concern. Therefore, its original design has security flaws and is known to be able to bypass secure boot and potentially compromise system integrity.
We can see the problems just based on the definition of the system call itself:
struct kexec_segment {
    const void *buf;
    size_t bufsz;
    const void *mem;
    size_t memsz;
};
...
long kexec_load(unsigned long entry, unsigned long nr_segments, struct kexec_segment *segments, unsigned long flags);
So the kernel expects just a collection of buffers with code to execute. Back in those days there was not much desire to do a lot of data parsing inside the kernel, so the idea was to parse the to-be-executed kernel image in user space and provide the kernel with only the data it needs. Also, to switch kernels live, we need an intermediate program which would take over while the old kernel is shutting down and the new kernel has not yet been executed. In the kexec world this program is called purgatory. Thus the problem is evident: we give the kernel a bunch of code and it will happily execute it at the highest privilege level. But instead of the original kernel or purgatory code, we can easily provide code similar to the one demonstrated earlier in this post, which disables SELinux (or does something else to the kernel).
At Cloudflare we have had kexec_load() disabled for some time now, just because of this. The advantage of faster reboots with kexec comes with a (small) risk of improperly initialized hardware, so it was not worth using even without the security concerns. However, kexec does provide one useful feature: it is the foundation of the Linux kernel crashdumping solution. In a nutshell, if a kernel crashes in production (due to a bug or some other error), a backup kernel (previously loaded with kexec) can take over, collect and save the memory dump for further investigation. This allows us to investigate kernel and other issues in production more effectively, so it is a powerful tool to have.
Luckily, since the original problems with kexec were outlined, Linux developed an alternative secure interface for kexec: instead of buffers with code it expects file descriptors with the to-be-executed kernel image and initrd and does parsing inside the kernel. Thus, only a valid kernel image can be supplied. On top of this, we can configure and require kexec to ensure the provided images are properly signed, so only authorized code can be executed in the kexec scenario. A secure configuration for kexec looks something like this:
ignat@dev:~$ grep KEXEC /boot/config-`uname -r`
CONFIG_KEXEC_CORE=y
CONFIG_HAVE_IMA_KEXEC=y
# CONFIG_KEXEC is not set
CONFIG_KEXEC_FILE=y
CONFIG_KEXEC_SIG=y
CONFIG_KEXEC_SIG_FORCE=y
CONFIG_KEXEC_BZIMAGE_VERIFY_SIG=y
…
Above we ensure that the legacy kexec_load() system call is disabled by disabling CONFIG_KEXEC, but we can still configure Linux kernel crashdumping via the new kexec_file_load() system call (CONFIG_KEXEC_FILE=y) with enforced signature checks (CONFIG_KEXEC_SIG=y and CONFIG_KEXEC_SIG_FORCE=y).
Note that the stock Debian kernel has the legacy kexec_load() system call enabled and does not enforce signature checks for kexec_file_load() (similar to module signature checks):
ignat@dev:~$ grep KEXEC /boot/config-6.1.0-18-cloud-amd64
CONFIG_KEXEC=y
CONFIG_KEXEC_FILE=y
CONFIG_ARCH_HAS_KEXEC_PURGATORY=y
CONFIG_KEXEC_SIG=y
# CONFIG_KEXEC_SIG_FORCE is not set
CONFIG_KEXEC_BZIMAGE_VERIFY_SIG=y
…
Even on the stock Debian kernel, if you try to repeat the exercise we described in the “Secure boot” section of this post after a system reboot, you will likely see that it fails to disable SELinux now. This is because we hardcoded the kernel address of the selinux_state structure in our malicious kernel module, but the address has now changed:
ignat@dev:~$ sudo grep selinux_state /proc/kallsyms
ffffffffb41bcae0 B selinux_state
Kernel Address Space Layout Randomization (or KASLR) is a simple concept: it slightly and randomly shifts the kernel code and data on each boot.
This is to combat targeted exploitation (like the malicious module in this post) based on knowledge of the location of internal kernel structures and code. It is especially useful for popular Linux distribution kernels, like the Debian one, because most users use the same binary and anyone can download the debug symbols and the System.map file with all the addresses of the kernel internals. Just to note: it will not prevent the module from loading and doing harm, but the module will likely not achieve the targeted effect of disabling SELinux. Instead, it will modify a random piece of kernel memory, potentially causing the kernel to crash.
Both the Cloudflare kernel and the Debian one have this feature enabled:
ignat@dev:~$ grep RANDOMIZE_BASE /boot/config-`uname -r`
CONFIG_RANDOMIZE_BASE=y
While KASLR helps with targeted exploits, it is quite easy to bypass, since everything is shifted by a single random offset, as shown on the diagram above. Thus, if the attacker knows at least one runtime kernel address, they can recover this offset by subtracting the compile-time address of the same symbol (function or data structure), taken from the kernel’s System.map file, from its runtime address. Once they know the offset, they can recover the addresses of all other symbols by adjusting them by this offset.
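As a toy illustration of this arithmetic (a sketch with made-up compile-time addresses; in practice they would come from the distribution’s System.map, and the runtime value from some pointer leak):

# hypothetical compile-time addresses, as they would appear in System.map
SYSTEM_MAP = {
    "selinux_state": 0xffffffff835bcae0,
    "some_other_symbol": 0xffffffff810c92e0,
}

# runtime address of one symbol, obtained via a leak (value from the example above)
leaked_selinux_state = 0xffffffffb41bcae0

# KASLR shifts everything by one constant, so a single leak reveals the offset
kaslr_offset = leaked_selinux_state - SYSTEM_MAP["selinux_state"]

# every other symbol can now be relocated by the same offset
runtime_other = SYSTEM_MAP["some_other_symbol"] + kaslr_offset

print(f"KASLR offset: {kaslr_offset:#x}")
print(f"some_other_symbol at runtime: {runtime_other:#x}")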
Therefore, modern kernels take precautions not to leak kernel addresses, at least to unprivileged users. One of the main tunables for this is the kptr_restrict sysctl. It is a good idea to set it to at least 1, so that regular users cannot see kernel pointers:
ignat@dev:~$ sudo sysctl -w kernel.kptr_restrict=1
kernel.kptr_restrict = 1
ignat@dev:~$ grep selinux_state /proc/kallsyms
0000000000000000 B selinux_state
Privileged users can still see the pointers:
ignat@dev:~$ sudo grep selinux_state /proc/kallsyms
ffffffffb41bcae0 B selinux_state
Similar to the kptr_restrict sysctl, there is also dmesg_restrict, which, if set, prevents regular users from reading the kernel log (which may also leak kernel pointers via its messages). While you need to explicitly set the kptr_restrict sysctl to a non-zero value on each boot (or use some system sysctl configuration utility, like this one), you can configure the initial value of dmesg_restrict via the CONFIG_SECURITY_DMESG_RESTRICT kernel configuration option. Both the Cloudflare kernel and the Debian one enforce dmesg_restrict this way:
ignat@dev:~$ grep CONFIG_SECURITY_DMESG_RESTRICT /boot/config-`uname -r`
CONFIG_SECURITY_DMESG_RESTRICT=y
It is worth noting that /proc/kallsyms and the kernel log are not the only sources of potential kernel pointer leaks. There is a lot of legacy code in the Linux kernel, and new sources are continuously being found and patched. That’s why it is very important to stay up to date with the latest kernel bugfix releases.
Linux Security Modules (LSM) is a hook-based framework for implementing security policies and Mandatory Access Control in the Linux kernel. We have covered our usage of another LSM module, BPF-LSM, previously.
BPF-LSM is a useful foundational piece of our kernel security, but in this post we want to mention another useful LSM module we use: the Lockdown LSM. Lockdown can be in one of three states (controlled by the /sys/kernel/security/lockdown special file):
ignat@dev:~$ cat /sys/kernel/security/lockdown
[none] integrity confidentiality
none is the state where nothing is enforced and the module is effectively disabled. When Lockdown is in the integrity state, the kernel tries to prevent any operation which may compromise its integrity. We already covered some examples of these in this post: loading unsigned modules and executing unsigned code via KEXEC. But there are other potential ways (which are mentioned in the LSM’s man page), all of which this LSM tries to block. confidentiality is the most restrictive mode, where Lockdown will also try to prevent any information leakage from the kernel. In practice this may be too restrictive for server workloads, as it blocks all runtime debugging capabilities, like perf or eBPF.
Let’s see the Lockdown LSM in action. On a barebones Debian system the initial state is none, meaning nothing is locked down:
ignat@dev:~$ uname -a
Linux dev 6.1.0-18-cloud-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.76-1 (2024-02-01) x86_64 GNU/Linux
ignat@dev:~$ cat /sys/kernel/security/lockdown
[none] integrity confidentiality
We can switch the system into the integrity mode:
ignat@dev:~$ echo integrity | sudo tee /sys/kernel/security/lockdown
integrity
ignat@dev:~$ cat /sys/kernel/security/lockdown
none [integrity] confidentiality
It is worth noting that we can only put the system into a more restrictive state, but not back. That is, once in integrity mode we can only switch to confidentiality mode, but not back to none:
ignat@dev:~$ echo none | sudo tee /sys/kernel/security/lockdown
none
tee: /sys/kernel/security/lockdown: Operation not permitted
Now we can see that even on a stock Debian kernel, which, as we discovered above, does not enforce module signatures by default, we cannot load a potentially malicious unsigned kernel module anymore:
ignat@dev:~$ sudo insmod mymod/mymod.ko
insmod: ERROR: could not insert module mymod/mymod.ko: Operation not permitted
And the kernel log will helpfully point out that this is due to Lockdown LSM:
ignat@dev:~$ sudo dmesg | tail -n 1
[21728.820129] Lockdown: insmod: unsigned module loading is restricted; see man kernel_lockdown.7
As we can see, Lockdown LSM helps to tighten the security of a kernel, which otherwise may not have other enforcing bits enabled, like the stock Debian one.
If you compile your own kernel, you can go one step further and set the initial state of the Lockdown LSM to be more restrictive than none from the start. This is exactly what we did for the Cloudflare production kernel:
ignat@dev:~$ grep LOCK_DOWN /boot/config-6.6.17-cloudflare-2024.2.9
# CONFIG_LOCK_DOWN_KERNEL_FORCE_NONE is not set
CONFIG_LOCK_DOWN_KERNEL_FORCE_INTEGRITY=y
# CONFIG_LOCK_DOWN_KERNEL_FORCE_CONFIDENTIALITY is not set
In this post we reviewed some useful Linux kernel security configuration options we use at Cloudflare. This is only a small subset, and there are many more available and even more are being constantly developed, reviewed, and improved by the Linux kernel community. We hope that this post will shed some light on these security features and that, if you haven’t already, you may consider enabling them in your Linux systems.
Tune in for more news, announcements and thought-provoking discussions! Don't miss the full Security Week hub page.
JUST OVER A DECADE AGO, Bitcoin appeared to many of its adherents to be the crypto-anarchist holy grail: truly private digital cash for the Internet.
Satoshi Nakamoto, the cryptocurrency’s mysterious and unidentifiable inventor, had stated in an email introducing Bitcoin that “participants can be anonymous.” And the Silk Road dark-web drug market seemed like living proof of that potential, enabling the sale of hundreds of millions of dollars in illegal drugs and other contraband for bitcoin while flaunting its impunity from law enforcement.
This is the story of the revelation in late 2013 that Bitcoin was, in fact, the opposite of untraceable—that its blockchain would actually allow researchers, tech companies, and law enforcement to trace and identify users with even more transparency than the existing financial system. That discovery would upend the world of cybercrime. Bitcoin tracing would, over the next few years, solve the mystery of the theft of a half-billion dollar stash of bitcoins from the world’s first crypto exchange, help enable the biggest dark-web drug market takedown in history, lead to the arrest of hundreds of pedophiles around the world in the bust of the dark web’s largest child sexual abuse video site, and result in the first-, second-, and third-biggest law enforcement monetary seizures in the history of the US Justice Department.
Before SaaS Quick Launch, configuring and launching third-party SaaS products could be time-consuming and costly, especially in certain categories like security and monitoring. Some products require hours of engineering time to manually set up permissions policies and cloud infrastructure. Manual multistep configuration processes also introduce risks when buyers rely on unvetted deployment templates and instructions from third-party resources.
SaaS Quick Launch helps buyers make the deployment process easy, fast, and secure by offering step-by-step instructions and resource deployment using preconfigured AWS CloudFormation templates. The software vendor and AWS validate these templates to ensure that the configuration adheres to the latest AWS security standards.
Getting started with SaaS Quick Launch
It’s easy to find which SaaS products have Quick Launch enabled when you are browsing in AWS Marketplace. Products that have this feature configured have a Quick Launch tag in their description.
After completing the purchase process for a Quick Launch–enabled product, you will see a button to set up your account. That button will take you to the Configure and launch page, where you can complete the registration to set up your SaaS account, deploy any required AWS resources, and launch the SaaS product.
The first step ensures that your account has the required AWS permissions to configure the software.
The second step involves configuring the vendor account, either to sign in to an existing account or to create a new account on the vendor website. After signing in, the vendor site may pass essential keys and parameters that are needed in the next step to configure the integration.
The third step allows you to configure the software and AWS integration. In this step, the vendor provides one or more CloudFormation templates that provision the required AWS resources to configure and use the product.
The final step is to launch the software once everything is configured.
Availability
Sellers can enable this feature in their SaaS product. If you are a seller and want to learn how to set this up in your product, check the Seller Guide for detailed instructions.
To learn more about SaaS in AWS Marketplace, visit the service page and view all the available SaaS products currently in AWS Marketplace.
— Marcia
Background
Late last year we announced the AWS Digital Sovereignty Pledge and made a commitment to offer you (and all AWS customers) the most advanced set of sovereignty controls and features available in the cloud. Since that announcement we have taken several important steps forward in fulfillment of that pledge:
May 2023 – We announced that AWS Nitro System had been validated by an independent third-party to confirm that it contains no mechanism that allows anyone at AWS to access your data on AWS hosts. At the same time we announced that the AWS Key Management Service (KMS) External Key Store allows you to store keys outside of AWS and use them to encrypt data stored in AWS.
August 2023 – We announced AWS Dedicated Local Zones, infrastructure that is fully managed by AWS and built for exclusive use by a customer or community, and placed in a customer-specified location or data center.
AWS European Sovereign Cloud
The upcoming AWS European Sovereign Cloud will be separate from, and independent of, the eight existing AWS Regions already open in Frankfurt, Ireland, London, Milan, Paris, Stockholm, Spain, and Zurich. It will give you additional options for deployment, while providing AWS services, APIs, and tools that you are already familiar with. The design will help you meet your data residency, operational autonomy, and resiliency needs.
In order to maintain separation between this cloud and the existing AWS Global Cloud you will need to create a fresh AWS account. The metadata you create such as data labels, categories, permissions, and configurations will be stored within the EU. This does not apply to AWS account information such as spend and billing data, which will be aggregated and used to ensure that you get favorable pricing within any applicable volume usage tiers.
As I mentioned earlier, this cloud will be operated and supported by AWS employees located in and residents of the EU, with support available 24/7/365.
The AWS European Sovereign Cloud will be operationally independent of the other regions, with separate in-Region billing and usage metering systems.
Initial Region
The initial region will be located in Germany. It will launch with multiple Availability Zones, each in separate and distinct geographic locations, with enough distance between them to significantly reduce the risk of a single event impacting your business continuity. We will have additional details on the list of available services, instance types, and so forth as we get closer to the launch.
Over time, this and other regions in this cloud will also function as parent regions for AWS Outposts and Dedicated Local Zones. These options give you even more flexibility with regard to isolation and in-country data residency. If you would like to express your interest in Dedicated Local Zones in your country, please contact your AWS account manager.
Get Ready
You can start to build applications today in any of the existing regions and move them to the AWS European Sovereign Cloud when the region launches. You can also initiate conversations with your local regulatory authorities in order to better understand any issues that are specific to your particular location.
— Jeff;
In the face of rapid digital transformation, a positive organizational culture and user-centric design are the backbone of successful software delivery. And while Artificial Intelligence (AI) is the center of so many contemporary technical conversations, the impact of AI development tools on teams is still in its infancy.
These are just some of the findings from the 2023 Accelerate State of DevOps Report, the annual report from Google Cloud’s DevOps Research and Assessment (DORA) team.
For nine years, the State of DevOps survey has assembled data from more than 36,000 professionals worldwide, making it the largest and longest-running research of its kind. This year, we took a deep dive into how high-performing teams bake technical, process, and cultural capabilities into their development practices to drive success. Specifically, we explored three key outcomes of having a DevOps practice and the capabilities that contribute to achieving them:
This year, we were working with a particularly robust data set: the total number of organic respondents increased by 3.6x compared to last year, allowing us to perform a deeper analysis of the relationship between ways of working and outcomes. Thank you to everyone who took the survey this year!
Our research shows that an organization’s level of software delivery performance predicts overall performance, team performance, and employee well-being. In turn, we use the following measures to understand the throughput and stability of software changes:
Our analysis revealed four performance levels, including the return of the Elite performance level, which we did not detect in last year’s cohort. Elite performers around the world are able to achieve both throughput and stability.
There are several key takeaways for teams who want to understand how to improve their software delivery capabilities. Here are some of the key insights from this year’s report:
1. Establish a healthy culture
Culture is foundational to building technical capabilities, igniting technical performance, reaching organizational performance goals, and helping employees be successful. A healthy culture can help reduce burnout, increase productivity, and increase job satisfaction. Teams with generative cultures, composed of people who felt included and like they belonged on their team, have 30% higher organizational performance than organizations without a generative culture.
2. Build with users in mind
Teams can deploy as fast and successfully as they'd like, but without the user in mind, it might be for naught. Our research shows that a user-centric approach to building applications and services is one of the strongest predictors of overall organizational performance. In fact, building with the user in mind appears to inform and drive improvements across all of the technical, process, and cultural capabilities we explore in the DORA research. Teams that focus on the user have 40% higher organizational performance than teams that don’t.
3. Amplify technical capabilities with quality documentation
High-quality documentation amplifies the impact that DevOps technical capabilities (for example, continuous integration and trunk-based development) have on organizational performance. This means that quality documentation not only helps establish these technical capabilities, but helps them matter. For example, SRE practices are estimated to have 1.4x more impact on organizational performance when high-quality documentation is in place. Overall, high-quality documentation leads to 25% higher team performance relative to low-quality documentation.
4. Distribute work fairly
People who identify as underrepresented, and women or those who chose to self-describe their gender, have higher levels of burnout. There are likely multiple systemic and environmental factors that cause this. Unsurprisingly, we find that respondents who take on more repetitive work are more likely to experience higher levels of burnout, and members of underrepresented groups are more likely to take on more repetitive work: underrepresented respondents report 24% more burnout than those who are not underrepresented; underrepresented respondents do 29% more repetitive work than those who are not underrepresented; and women or those who self-described their gender do 40% more repetitive work than men.
5. Increase infrastructure flexibility with cloud
Teams can get the most value out of the cloud by leveraging the characteristics of cloud like rapid elasticity and on-demand self-service. These characteristics predict a more flexible infrastructure. Using a public cloud, for example, leads to a 22% increase in infrastructure flexibility relative to not using the cloud. This flexibility, in turn, leads to teams with 30% higher organizational performance than those with inflexible infrastructures.
There is a lot of enthusiasm about the potential of AI development tools. We saw this in this year’s results — in fact a majority of respondents are incorporating at least some AI into the tasks we included in our survey. But we anticipate that it will take some time for AI-powered tools to come into widespread and coordinated use in the industry. We are very interested in seeing how adoption grows over time and the impact that growth will have on performance measures and outcomes that are important to organizations. Here’s where we are seeing the adoption of AI tools today:
The key takeaway from DORA’s research is that high performance requires continuous improvement. Regularly measure outcomes across your organization, teams, and employees. Identify areas for optimization and make incremental changes to dial up performance.
Don't let these insights sit on a shelf — put them into action. Contextualize the findings based on your team's current practices and pain points. Have open conversations about your bottlenecks. Comparing your metrics year-over-year is more meaningful than comparing yourself to other companies. Sustainable success comes from repeatedly finding and fixing your weaknesses. DORA's framework can help you determine which capabilities to focus on next for the biggest performance boost.
We hope the Accelerate State of DevOps Report helps organizations of all sizes, industries, and regions improve their DevOps capabilities, and we look forward to hearing your thoughts and feedback. To learn more about the report and implementing DevOps with Google Cloud:
Last year, we announced the Browser Rendering API, letting users run Puppeteer, a browser automation library, directly in Workers. Puppeteer is one of the most popular libraries used to interact with a headless browser instance to accomplish tasks like taking screenshots, generating PDFs, crawling web pages, and testing web applications. We’ve heard from developers that configuring and maintaining their own serverless browser automation systems can be quite painful.
The Workers Browser Rendering API solves this. It makes the Puppeteer library available directly in your Worker, connected to a real web browser, without the need to configure and manage infrastructure or keep browser sessions warm yourself. You can use @cloudflare/puppeteer to run the full Puppeteer API directly on Workers!
We’ve seen so much interest from the developer community since launching last year. While the Browser Rendering API is still in beta (sign up to our waitlist to get access), we wanted to share a way to get more out of our current limits by using the Browser Rendering API with Durable Objects. We’ll also be sharing pricing for the Rendering API, so you can build knowing exactly what you’ll pay for.
As a designer or frontend developer, you want to make sure that content is well-designed for visitors browsing on different screen sizes. With the number of possible devices that users browse on growing, it becomes difficult to test all the possibilities manually. While there are many testing tools on the market, we want to show how easy it is to create your own Chromium-based tool with the Workers Browser Rendering API and Durable Objects.
We’ll be using the Worker to handle any incoming requests, pass them to the Durable Object to take screenshots and store them in an R2 bucket. The Durable Object is used to create a browser session that’s persistent. By using Durable Object Alarms we can keep browsers open for longer and reuse browser sessions across requests.
Let’s dive into how we can build this application:
1. Define the configuration and bindings in wrangler.toml
name = "rendering-api-demo"
main = "src/index.js"
compatibility_date = "2023-09-04"
compatibility_flags = [ "nodejs_compat"]
account_id = "c05e6a39aa4ccdd53ad17032f8a4dc10"
# Browser Rendering API binding
browser = { binding = "MYBROWSER" }
# Bind an R2 Bucket
[[r2_buckets]]
binding = "BUCKET"
bucket_name = "screenshots"
# Binding to a Durable Object
[[durable_objects.bindings]]
name = "BROWSER"
class_name = "Browser"
[[migrations]]
tag = "v1" # Should be unique for each entry
new_classes = ["Browser"] # Array of new classes
2. Define the Worker
This Worker simply passes the request onto the Durable Object.
export default {
async fetch(request, env) {
let id = env.BROWSER.idFromName("browser");
let obj = env.BROWSER.get(id);
// Send a request to the Durable Object, then await its response.
let resp = await obj.fetch(request.url);
let count = await resp.text();
return new Response("success");
}
};
3. Define the Durable Object class
import puppeteer from "@cloudflare/puppeteer";

const KEEP_BROWSER_ALIVE_IN_SECONDS = 60;
export class Browser {
constructor(state, env) {
this.state = state;
this.env = env;
this.keptAliveInSeconds = 0;
this.storage = this.state.storage;
}
async fetch(request) {
// screen resolutions to test out
const width = [1920, 1366, 1536, 360, 414]
const height = [1080, 768, 864, 640, 896]
// use the current date and time to create a folder structure for R2
const nowDate = new Date()
var coeff = 1000 * 60 * 5
var roundedDate = (new Date(Math.round(nowDate.getTime() / coeff) * coeff)).toString();
var folder = roundedDate.split(" GMT")[0]
//if there's a browser session open, re-use it
if (!this.browser) {
console.log(`Browser DO: Starting new instance`);
try {
this.browser = await puppeteer.launch(this.env.MYBROWSER);
} catch (e) {
console.log(`Browser DO: Could not start browser instance. Error: ${e}`);
}
}
// Reset keptAlive after each call to the DO
this.keptAliveInSeconds = 0;
const page = await this.browser.newPage();
// take screenshots of each screen size
for (let i = 0; i < width.length; i++) {
await page.setViewport({ width: width[i], height: height[i] });
await page.goto("https://workers.cloudflare.com/");
const fileName = "screenshot_" + width[i] + "x" + height[i]
const sc = await page.screenshot({
path: fileName + ".jpg"
}
);
this.env.BUCKET.put(folder + "/"+ fileName + ".jpg", sc);
}
// Reset keptAlive after performing tasks to the DO.
this.keptAliveInSeconds = 0;
// set the first alarm to keep DO alive
let currentAlarm = await this.storage.getAlarm();
if (currentAlarm == null) {
console.log(`Browser DO: setting alarm`);
const TEN_SECONDS = 10 * 1000;
this.storage.setAlarm(Date.now() + TEN_SECONDS);
}
await this.browser.close();
return new Response("success");
}
async alarm() {
this.keptAliveInSeconds += 10;
// Extend browser DO life
if (this.keptAliveInSeconds < KEEP_BROWSER_ALIVE_IN_SECONDS) {
console.log(`Browser DO: has been kept alive for ${this.keptAliveInSeconds} seconds. Extending lifespan.`);
this.storage.setAlarm(Date.now() + 10 * 1000);
} else console.log(`Browser DO: exceeded life of ${KEEP_BROWSER_ALIVE_IN_SECONDS}. Browser DO will be shut down in 10 seconds.`);
}
}
That’s it! With less than a hundred lines of code, you can fully customize a powerful tool to automate responsive web design testing. You can even incorporate it into your CI pipeline to automatically test different window sizes with each build and verify the result is as expected by using an automated library like pixelmatch.
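As a sketch of what such an automated check could look like (using Pillow here instead of pixelmatch, purely for illustration; the file names are hypothetical and would correspond to the screenshots the Worker stores in R2):

from PIL import Image, ImageChops

# baseline screenshot from a known-good build vs. the latest capture (hypothetical paths);
# both images must have the same dimensions for the comparison below
baseline = Image.open("baseline/screenshot_1920x1080.jpg").convert("RGB")
latest = Image.open("latest/screenshot_1920x1080.jpg").convert("RGB")

# pixel-wise difference; getbbox() is None when the images are identical
diff = ImageChops.difference(baseline, latest)

# note: lossy JPEGs rarely match exactly, so a real check would allow a small tolerance
if diff.getbbox() is None:
    print("screenshots match")
else:
    print("screenshots differ, review the layout change")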
We’ve spoken to many customers deploying a Puppeteer service on their own infrastructure, on public cloud containers or functions or using managed services. The common theme that we’ve heard is that these services are costly – costly to maintain and expensive to run.
While you won’t be billed for the Browser Rendering API yet, we want to be transparent with you about costs before you start building. We know it’s important to understand the pricing structure so that you don’t get a surprise bill and so that you can design your application efficiently.
You pay based on two usage metrics:
Using Durable Objects to persist browser sessions improves performance by eliminating the time that it takes to spin up a new browser session. Since it re-uses sessions, it cuts down on the number of concurrent sessions needed. We highly encourage this model of session re-use if you expect to see consistent traffic for applications that you build on the Browser Rendering API.
If you have feedback about this pricing, we’re all ears. Feel free to reach out through Discord (channel name: browser-rendering-api-beta) and share your thoughts.
Sign up to our waitlist to get access to the Workers Browser Rendering API. We’re so excited to see what you build! Share your creations with us on Twitter/X @CloudflareDev or on our Discord community.
Vercel builds a front-end cloud that makes it easier for engineers to deploy and run their front-end applications. With more than 100 million deployments in Vercel in the last two years, Vercel helps users take advantage of best-in-class AWS infrastructure with zero configuration by relying heavily on serverless technology. Vercel provides a lot of features that help developers host their front-end applications. However, until the beginning of this year, they hadn’t built Cron Jobs yet.
A cron job is a scheduled task that automates running specific commands or scripts at predetermined intervals or fixed times. It enables users to set up regular, repetitive actions, such as backups, sending notification emails to customers, or processing payments when a subscription needs to be renewed. Cron jobs are widely used in computing environments to improve efficiency and automate routine operations, and they were a commonly requested feature from Vercel’s customers.
In December 2022, Vercel hosted an internal hackathon to foster innovation. That’s where Vincent Voyer and Andreas Schneider joined forces to build a prototype cron job feature for the Vercel platform. They formed a team of five people and worked on the feature for a week. The team worked on different tasks, from building a user interface to display the cron jobs to creating the backend implementation of the feature.
Amazon EventBridge Scheduler
When the hackathon team started thinking about solving the cron job problem, their first idea was to use Amazon EventBridge rules that run on a schedule. However, they realized quickly that this feature has a limit of 300 rules per account per AWS Region, which wasn’t enough for their intended use. Luckily, one of the team members had read the announcement of Amazon EventBridge Scheduler in the AWS Compute blog and they thought this would be a perfect tool for their problem.
By using EventBridge Scheduler, they could schedule millions of one-time or recurring tasks across over 270 AWS services without provisioning or managing the underlying infrastructure.
For creating a new cron job in Vercel, a customer needs to define the frequency in which this task will run and the API they want to invoke. Vercel, in the backend, uses EventBridge Scheduler and creates a new schedule when a new cron job is created.
To call the endpoint, the team used an AWS Lambda function that receives the path that needs to be invoked as input parameters.
When the time comes for the cron job to run, EventBridge Scheduler invokes the function, which then calls the customer website endpoint that was configured.
By the end of the week, Vincent and his team had a working prototype version of the cron jobs feature, and they won a prize at the hackathon.
Building Vercel Cron Jobs
After working for one week on this prototype in December, the hackathon ended, and Vincent and his team returned to their regular jobs. In early January 2023, Vincent and the Vercel team decided to take the project and turn it into a real product.
During the hackathon, the team built the fundamental parts of the feature, but there were some details that they needed to polish to make it production ready. Vincent and Andreas worked on the feature, and in less than two months, on February 22, 2023, they announced Vercel Cron Jobs to the public. The announcement tweet got over 400 thousand views, and the community loved the launch.
The adoption of this feature was very rapid. Within a few months of launching Cron Jobs, Vercel reached over 7 million cron invocations per week, and they expect the adoption to continue growing.
How Vercel Cron Jobs Handles Scale
With this pace of adoption, scaling this feature is crucial for Vercel. In order to scale the amount of cron invocations at this pace, they had to make some business and architectural decisions.
From the business perspective, they defined limits for their free-tier customers. Free-tier customers can create a maximum of two cron jobs in their account, and they can only have hourly schedules. This means that free customers cannot run a cron job every 30 minutes; instead, they can do it at most every hour. Only customers on Vercel paid tiers can take advantage of EventBridge Scheduler minute granularity for scheduling tasks.
Also, for free customers, minute precision isn’t guaranteed. To achieve this, Vincent took advantage of the time window configuration from EventBridge Scheduler. The flexible time window configuration allows you to start a schedule within a window of time. This means that the scheduled tasks are dispersed across the time window to reduce the impact of multiple requests on downstream services. This is very useful if, for example, many customers want to run their jobs at midnight. By using the flexible time window, the load can spread across a set window of time.
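To make the flexible time window idea concrete, here is a minimal sketch of creating such a schedule with the AWS SDK for Python (this is not Vercel’s actual implementation; the names, ARNs, and payload are placeholders):

import boto3

scheduler = boto3.client("scheduler")

# hypothetical hourly cron job whose invocation may be dispersed
# anywhere within a 15-minute window to smooth out load spikes
scheduler.create_schedule(
    Name="customer-123-hourly-cron",  # placeholder schedule name
    ScheduleExpression="rate(1 hour)",
    FlexibleTimeWindow={
        "Mode": "FLEXIBLE",
        "MaximumWindowInMinutes": 15,
    },
    Target={
        "Arn": "arn:aws:lambda:us-east-1:123456789012:function:invoke-cron",  # placeholder
        "RoleArn": "arn:aws:iam::123456789012:role/scheduler-invoke-role",  # placeholder
        "Input": '{"path": "/api/cron/renew-subscriptions"}',  # path for the Lambda to call
    },
)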
From the architectural perspective, Vercel took advantage of hosting the APIs and owning the functions that the cron jobs invoke.
This means that when the Lambda function is started by EventBridge Scheduler, the function ends its run without waiting for a response from the API. Then Vercel validates if the cron job ran by checking if the API and Vercel function ran correctly from its observability mechanisms. In this way, the function duration is very short, less than 400 milliseconds. This allows Vercel to run a lot of functions per second without affecting their concurrency limits.
What Was The Impact?
Vercel’s implementation of Cron Jobs is an excellent example of what serverless technologies enable. In two months, with two people working full time, they were able to launch a feature that their community needed and enthusiastically adopted. This feature shows the completeness of Vercel’s platform and is an important feature to convince their customers to move to a paid account.
If you want to get started with EventBridge Scheduler, see Serverless Land patterns for EventBridge Scheduler, where you’ll find a broad range of examples to help you.
— Marcia
There’s something for everyone in the weather app category. There are incredibly technical, complex apps, apps with a narrow focus, ones junked up with ads that don’t respect your privacy, and everything in between.
One of my favorite newer entrants in the category that I’ve been keeping an eye on for a while is Mercury Weather, a weather app that’s available as a universal purchase on all of Apple’s platforms. The app, by Triple Glazed Studios, is a pleasure to use, combining a clear, simple design with coverage of all of Apple’s platforms.
In some ways, Mercury Weather is a spiritual successor to Weather Line, a graph-centric weather app that was sold to an unnamed purchaser a couple of years ago, which some suspect was Fox Weather based on the app’s 2023 redesign. The comparison is apt but sells Mercury Weather short because its design is superior to what Weather Line’s ever was. The app uses beautiful gradient backgrounds to convey the temperature and conditions, along with a modern layout and clear typography to make it fast and easy to check current conditions and the forecast.
Mercury Weather uses its colorful backgrounds to convey information about the current conditions.
On the iPhone, Mercury Weather is divided into three primary sections. The current temperature, conditions, humidity, wind speed, and UV index occupy the top section of the screen, backed by a gradient that conveys temperature or cloud cover. That section is followed by an hourly forecast that scrolls horizontally to show a full 24 hours of data. The next section is a daily forecast for the next eight days with high and low temperatures and forecast conditions. The daily forecast also includes the selected day’s sunrise and sunset times and the expected maximum UV index, wind, and rainfall. At the very bottom, there’s also a button that shows monthly averages, which slide up from the bottom of the screen when tapped with temperature, sunshine, and precipitation averages.
Apple’s Weather app (left) versus Mercury Weather (right).
The globe button at the top of Mercury Weather lets you search for, save, and switch between the weather forecasts for multiple locations. Also, the Settings button at the bottom of the screen offers options for displaying the actual or ‘feels like’ temperature, specific (Apple Weather or OpenWeather) or dynamic weather data sources, and whether warnings are shown.
Examples of Mercury Weather’s iPad widget options.
Mercury Weather’s iPhone and iPad apps include a collection of small and medium-sized Home Screen widgets. There are options to display the current weather conditions, hourly forecast, and daily forecast for your location, and others that allow you to show the same weather conditions and forecasts for any saved location. The iPhone app has Lock Screen widgets in all available sizes with options for current, hourly, and daily data, too.
Mercury Weather on iPad.
The iPad version of Mercury Weather covers the same ground as the iPhone version, but the larger display allows the app’s tiled interface to be spread out and seen all at once. Also, the daily forecast section of the app adds a text-based forecast to the mix on the iPad and locations are included in an expandable left sidebar.
Looks like a nice day ahead.
Mercury Weather’s Watch app takes a similar approach to the iPhone app, but the forecast is limited to your current location. Like the other versions of the app, the app’s design is terrific on both the iPad and Apple Watch.
Mercury Weather in light and dark mode on the Mac.
The latest addition to the Mercury Weather mix is a Mac version of the app. The interface is essentially the same as the iPad version but with the addition of a menu bar app that can be turned off if you prefer.
Mercury Weather’s menu bar app.
I’ve been using my iPhone in StandBy mode when I’m at my desk, where I use Apple’s Weather widget to monitor local conditions. However, I can’t do that if I’m not at home. For those times, I’ve been running Mercury Weather, which adds an icon for the current conditions, plus the temperature to my menu bar. Clicking on the menu bar item adds a text description of the conditions, the forecast high and low temperature for the day, four-hour and four-day forecast graphs, and a button to open the full app. It’s the perfect amount of information for a quick check of the weather.
Mercury Weather handles the basics of weather apps extremely well. If you’re looking for radar and other more advanced features, you should try a different app. However, if all you want is the current conditions and hourly and daily forecasts presented in an easy-to-read, refined app that is consistent across all of Apple’s platforms, Mercury Weather is a great choice.
Mercury Weather is free to download on the App Store, with Home and Lock Screen widgets, the Apple Watch app, historical data, and more than one saved location available to subscribers for $1.99/month or $9.99/year, with a $34.99 lifetime purchase option. A Family Sharing subscription is $3.49/month, $16.99/year, or $59.99 for a lifetime purchase.
Founded in 2015, Club MacStories has delivered exclusive content every week for over six years.
In that time, members have enjoyed nearly 400 weekly and monthly newsletters packed with more of your favorite MacStories writing as well as Club-only podcasts, eBooks, discounts on apps, icons, and services. Join today, and you’ll get everything new that we publish every week, plus access to our entire archive of back issues and downloadable perks.
The Club expanded in 2021 with Club MacStories+ and Club Premier. Club MacStories+ members enjoy even more exclusive stories, a vibrant Discord community, a rotating roster of app discounts, and more. And, with Club Premier, you get everything we offer at every Club level plus an extended, ad-free version of our podcast AppStories that is delivered early each week in high-bitrate audio.
Over the last couple of months, Workers KV has suffered from a series of incidents, culminating in three back-to-back incidents during the week of July 17th, 2023. These incidents have directly impacted customers that rely on KV — and this isn’t good enough.
We’re going to share the work we have done to understand why KV has had such a spate of incidents and, more importantly, share in depth what we’re doing to dramatically improve how we deploy changes to KV going forward.
Workers KV — or just “KV” — is a key-value service for storing data: specifically, data with high read throughput requirements. It’s especially useful for user configuration, service routing, small assets and/or authentication data.
We use KV extensively inside Cloudflare too, with Cloudflare Access (part of our Zero Trust suite) and Cloudflare Pages being some of our highest profile internal customers. Both teams benefit from KV’s ability to keep regularly accessed key-value pairs close to where they’re accessed, as well its ability to scale out horizontally without any need to become an expert in operating KV.
Given Cloudflare’s extensive use of KV, it wasn’t just external customers impacted. Our own internal teams felt the pain of these incidents, too.
Back in June 2023, we announced the move to a new architecture for KV, which is designed to address two major points of customer feedback we’ve had around KV: high latency for infrequently accessed keys (or a key accessed in different regions), and working to ensure the upper bound on KV’s eventual consistency model for writes is 60 seconds — not “mostly 60 seconds”.
At the time of the blog, we’d already been testing this internally, including early access with our community champions and running a small % of production traffic to validate stability and performance expectations beyond what we could emulate within a staging environment.
However, in the weeks between mid-June and culminating in the series of incidents during the week of July 17th, we would continue to increase the volume of new traffic onto the new architecture. When we did this, we would encounter previously unseen problems (many of these customer-impacting) — then immediately roll back, fix bugs, and repeat. Internally, we’d begun to identify that this pattern was becoming unsustainable — each attempt to cut traffic onto the new architecture would surface errors or behaviors we hadn’t seen before and couldn’t immediately explain, and thus we would roll back and assess.
The issues at the root of this series of incidents proved particularly challenging to track down and observe. Once identified, the two causes themselves were quick to fix, but (1) an observability gap in our error reporting and (2) a mutation to local state that resulted in an unexpected mutation of global state were both hard to observe and reproduce in the days following the end of the customer-facing impact.
One important piece of context to understand before we go into detail on the post-mortem: Workers KV is composed of two separate Workers scripts – internally referred to as the Storage Gateway Worker and SuperCache. SuperCache is an optional path in the Storage Gateway Worker workflow, and is the basis for KV's new (faster) backend (refer to the blog).
Here is a timeline of events:
Time | Description |
---|---|
2023-07-17 21:52 UTC | Cloudflare observes alerts showing 500 HTTP status codes in the MEL01 data-center (Melbourne, AU) and begins investigating. We also begin to see a small set of customers reporting HTTP 500s being returned via multiple channels. It is not immediately clear if this is a data-center-wide issue or KV specific, as there had not been a recent KV deployment, and the issue directly correlated with three data-centers being brought back online. |
2023-07-18 00:09 UTC | We disable the new backend for KV in MEL01 in an attempt to mitigate the issue (noting that there had not been a recent deployment or change to the % of users on the new backend). |
2023-07-18 05:42 UTC | Investigating alerts showing 500 HTTP status codes in VIE02 (Vienna, AT) and JNB01 (Johannesburg, SA). |
2023-07-18 13:51 UTC | The new backend is disabled globally after seeing issues in VIE02 (Vienna, AT) and JNB01 (Johannesburg, SA) data-centers, similar to MEL01. In both cases, they had also recently come back online after maintenance, but it remained unclear as to why KV was failing. |
2023-07-20 19:12 UTC | The new backend is inadvertently re-enabled while deploying the update due to a misconfiguration in a deployment script. |
2023-07-20 19:33 UTC | The new backend is (re-) disabled globally as HTTP 500 errors return. |
2023-07-20 23:46 UTC | Broken Workers script pipeline deployed as part of gradual rollout due to incorrectly defined pipeline configuration in the deployment script. Metrics begin to report that a subset of traffic is being black-holed. |
2023-07-20 23:56 UTC | Broken pipeline rolled back; error rates return to pre-incident (normal) levels. |
All timestamps referenced are in Coordinated Universal Time (UTC).
We initially observed alerts showing 500 HTTP status codes in the MEL01 data-center (Melbourne, AU) at 21:52 UTC on July 17th, and began investigating. We also received reports from a small set of customers, via multiple channels, of HTTP 500s being returned. This correlated with three data centers being brought back online, and it was not immediately clear whether the issue related to those data centers or was KV-specific — especially given there had not been a recent KV deployment. At 05:42 UTC, we began investigating alerts showing 500 HTTP status codes in the VIE02 (Vienna) and JNB02 (Johannesburg) data-centers; while both had recently come back online after maintenance, it was still unclear why KV was failing. At 13:51 UTC, we made the decision to disable the new backend globally.
Following the incident on July 18th, we attempted to deploy an allow-list configuration to reduce the scope of impacted accounts. However, while attempting to roll out a change for the Storage Gateway Worker at 19:12 UTC on July 20th, an older configuration was progressed, causing the new backend to be enabled again and leading to the third event. As the team worked to fix this and deploy the intended configuration, they attempted to manually progress the deployment at 23:46 UTC, which passed a malformed configuration value that caused traffic to be sent to an invalid Workers script configuration.
After all deployments and the broken Workers configuration (pipeline) had been rolled back at 23:56 UTC on July 20th, we spent the following three days working to identify the root cause of the issue. We lacked observability because KV's Worker script (responsible for much of KV's logic) was throwing an unhandled exception very early in the request handling process. This was further exacerbated by prior work to disable error reporting in a disabled data-center due to the noise generated, which had previously resulted in logs being rate-limited upstream from our service.
This previous mitigation prevented us from capturing meaningful logs from the Worker, including identifying the exception itself, as an uncaught exception terminates request processing. This has raised the priority of improving how unhandled exceptions are reported and surfaced in a Worker (see Recommendations, below, for further details). This issue was exacerbated by the fact that KV's Worker script would fail to re-enter its "healthy" state when a Cloudflare data center was brought back online, as the Worker was mutating an environment variable perceived to be in request scope, but that was in global scope and persisted across requests. This effectively left the Worker “frozen” with the previous, invalid configuration for the affected locations.
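To make the second root cause more concrete, here is a minimal, hypothetical Worker sketch (not KV's actual code) showing how state that looks request-scoped can actually live in global scope and persist across requests served by the same isolate:

// Hypothetical illustration only — not Workers KV's actual code.
// `currentConfig` lives at module (global) scope, so it survives across
// requests handled by the same Worker isolate.
let currentConfig = null;

// Stand-in for logic that reads configuration (e.g. from an env var).
async function loadConfig(env) {
  return { backendEnabled: env.BACKEND_ENABLED === "true" };
}

export default {
  async fetch(request, env) {
    if (currentConfig === null) {
      // This looks like per-request initialization, but the assignment is
      // global: a value loaded once while the data center was in a bad state
      // is reused by every later request until the isolate restarts.
      currentConfig = await loadConfig(env);
    }
    return new Response(JSON.stringify(currentConfig), {
      headers: { "content-type": "application/json" },
    });
  },
};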
Further, the introduction of a new progressive release process for Workers KV, designed to de-risk rollouts (as an action from a prior incident), prolonged the incident. We found a bug in the deployment logic that led to a broader outage due to an incorrectly defined configuration.
This configuration effectively caused us to drop a single-digit % of traffic until it was rolled back 10 minutes later. This code is untested at scale, and we need to spend more time hardening it before using it as the default path in production.
Additionally, although the root cause of the incidents was limited to three Cloudflare data-centers (Melbourne, Vienna, and Johannesburg), traffic across their regions still uses these data centers to route reads and writes to our system of record. Because these three data centers participate in KV’s new backend as regional tiers, a portion of traffic across the Oceania, Europe, and Africa regions was affected. Only a portion of keys from enrolled namespaces use any given data center as a regional tier, in order to limit a single (regional) point of failure, so while traffic across all data centers in each region was impacted, no data center saw all of its traffic affected.
Based on our error reporting, we estimated the affected traffic to be 0.2–0.5% of KV's global traffic; however, we observed some customers with error rates approaching 20% of their total KV operations. The impact was spread across KV namespaces and keys for customers within the scope of this incident.
Both KV’s high total traffic volume and its role as a critical dependency for many customers amplify the impact of even small error rates. In all cases, once the changes were rolled back, errors returned to normal levels and did not persist.
Before we dive into what we’re doing to significantly improve how we build, test, deploy and observe Workers KV going forward, we think there are lessons from the real world that can equally apply to how we improve the safety factor of the software we ship.
In traditional engineering and construction, there is an extremely common procedure known as a “JSEA”, or Job Safety and Environmental Analysis (sometimes just “JSA”). A JSEA is designed to help you iterate through a list of tasks, the potential hazards, and most importantly, the controls that will be applied to prevent those hazards from damaging equipment, injuring people, or worse.
One of the most critical concepts is the “hierarchy of controls” — that is, what controls should be applied to mitigate these hazards. In most practices, these are elimination, substitution, engineering, administration and personal protective equipment. Elimination and substitution are fairly self-explanatory: is there a different way to achieve this goal? Can we eliminate that task completely? Engineering and administration ask us whether there is additional engineering work, such as changing the placement of a panel, or using a horizontal boring machine to lay an underground pipe vs. opening up a trench that people can fall into.
The last and lowest on the hierarchy, is personal protective equipment (PPE). A hard hat can protect you from severe injury from something falling from above, but it’s a last resort, and it certainly isn’t guaranteed. In engineering practice, any hazard that only lists PPE as a mitigating factor is unsatisfactory: there must be additional controls in place. For example, instead of only wearing a hard hat, we should engineer the floor of scaffolding so that large objects (such as a wrench) cannot fall through in the first place. Further, if we require that all tools are attached to the wearer, then it significantly reduces the chance the tool can be dropped in the first place. These controls ensure that there are multiple degrees of mitigation — defense in depth — before your hard hat has to come into play.
Coming back to software, we can draw parallels between these controls: engineering can be likened to improving automation, gradual rollouts, and detailed metrics. Similarly, personal protective equipment can be likened to code review: useful, but code review cannot be the only thing protecting you from shipping bugs or untested code. Automation with linters, more robust testing, and new metrics are all vastly safer ways of shipping software.
As we spent time assessing where to improve our existing controls and how to put new controls in place to mitigate risks and improve the reliability (safety) of Workers KV, we took a similar approach: eliminating unnecessary changes, engineering more resilience into our codebase, improving automation and deployment tooling, and only then looking at human processes.
Cloudflare is undertaking a larger, more structured review of KV's observability tooling, release infrastructure and processes to mitigate not only the contributing factors to the incidents within this report, but recent incidents related to KV. Critically, we see tooling and automation as the most powerful mechanisms for preventing incidents, with process improvements designed to provide an additional layer of protection. Process improvements alone cannot be the only mitigation.
Specifically, we have identified and prioritized the efforts below as the most important next steps towards meeting our own availability SLOs and, above all, making KV a service that customers building on Workers can rely on for storing configuration and service data in the hot path of their traffic:
This is not an exhaustive list: we're continuing to expand on preventative measures associated with these and other incidents. These changes will improve not only KV's reliability, but also that of other services across Cloudflare that KV relies on, or that rely on KV.
We recognize that KV hasn’t lived up to our customers’ expectations recently. Because we rely on KV so heavily internally, we’ve felt that pain first hand as well. The work to fix the issues that led to this cycle of incidents is already underway. That work will not only improve KV’s reliability but also improve the reliability of any software written on the Cloudflare Workers developer platform, whether by our customers or by ourselves.
Public IPv4 Charge
As you may know, IPv4 addresses are an increasingly scarce resource and the cost to acquire a single public IPv4 address has risen more than 300% over the past 5 years. This change reflects our own costs and is also intended to encourage you to be a bit more frugal with your use of public IPv4 addresses and to think about accelerating your adoption of IPv6 as a modernization and conservation measure.
This change applies to all AWS services including Amazon Elastic Compute Cloud (Amazon EC2), Amazon Relational Database Service (RDS) database instances, Amazon Elastic Kubernetes Service (EKS) nodes, and other AWS services that can have a public IPv4 address allocated and attached, in all AWS regions (commercial, AWS China, and GovCloud). Here’s a summary in tabular form:
Public IP Address Type | Current Price/Hour (USD) | New Price/Hour (USD) (Effective February 1, 2024) |
---|---|---|
In-use Public IPv4 address (including Amazon provided public IPv4 and Elastic IP) assigned to resources in your VPC, Amazon Global Accelerator, and AWS Site-to-site VPN tunnel | No charge | $0.005 |
Additional (secondary) Elastic IP Address on a running EC2 instance | $0.005 | $0.005 |
Idle Elastic IP Address in account | $0.005 | $0.005 |
The AWS Free Tier for EC2 will include 750 hours of public IPv4 address usage per month for the first 12 months, effective February 1, 2024. You will not be charged for IP addresses that you own and bring to AWS using Amazon BYOIP.
Starting today, your AWS Cost and Usage Reports automatically include public IPv4 address usage. When this price change goes into effect next year, you will also be able to use AWS Cost Explorer to see and better understand your usage.
As I noted earlier in this post, I would like to encourage you to consider accelerating your adoption of IPv6. A new blog post shows you how to use Elastic Load Balancers and NAT Gateways for ingress and egress traffic, while avoiding the use of a public IPv4 address for each instance that you launch. Here are some resources to show you how you can use IPv6 with widely used services such as EC2, Amazon Virtual Private Cloud (Amazon VPC), Amazon Elastic Kubernetes Service (EKS), Elastic Load Balancing, and Amazon Relational Database Service (RDS):
Earlier this year we enhanced EC2 Instance Connect and gave it the ability to connect to your instances using private IPv4 addresses. As a result, you no longer need to use public IPv4 addresses for administrative purposes (generally using SSH or RDP).
Public IP Insights
In order to make it easier for you to monitor, analyze, and audit your use of public IPv4 addresses, today we are launching Public IP Insights, a new feature of Amazon VPC IP Address Manager that is available to you at no cost. In addition to helping you to make efficient use of public IPv4 addresses, Public IP Insights will give you a better understanding of your security profile. You can see the breakdown of public IP types and EIP usage, with multiple filtering options:
You can also see, sort, filter, and learn more about each of the public IPv4 addresses that you are using:
Using IPv4 Addresses Efficiently
By using the new IP Insights tool and following the guidance that I shared above, you should be ready to update your application to minimize the effect of the new charge. You may also want to consider using AWS Direct Connect to set up a dedicated network connection to AWS.
Finally, be sure to read our new blog post, Identify and Optimize Public IPv4 Address Usage on AWS, for more information on how to make the best use of public IPv4 addresses.
— Jeff;
Workforce Identity Federation allows use of an external identity provider (IdP) to authenticate and authorize users (including employees, partners, and contractors) to Google Cloud resources without provisioning identities in Cloud Identity. Before its introduction, only identities existing within Cloud Identity could be used with Cloud Identity Access Management (IAM).
Here’s how to configure an example JavaScript web application hosted in Google Cloud to call Google Cloud APIs after being authenticated with Azure AD using Workforce Identity Federation.
Workforce Identity can be used with IdPs supporting OpenID Connect (OIDC) or SAML 2.0. You can read more about it in our blog post and product documentation page.
There will be three high level configuration steps required:
Prepare your external IdP and get required configuration parameters.
Create a logical container for your external identities in Google Cloud in the form of Workforce Identity Pool.
Establish the relationship between your Workforce Identity Pool and the external IdP by configuring a Workforce Identity Pool Provider, using the information gathered in the first step.
Before Workforce Identity Federation can be used, a one-way trust relationship must be established between your Google Cloud environment and external IdP. This is achieved by configuring the following resources in Google Cloud: 1) a Workforce Identity Pool, which is a logical container for external identities and 2) a corresponding Workforce Identity Pool Provider, encapsulating technical details of external IdP integration.
An understanding of the OIDC flow may be helpful for understanding how this integrates with your application code. We will focus on a single-page web application which calls Google Cloud APIs. For simplicity, we omit details of the protocol, such as audiences and claims, as they are not essential to understanding the flow:
Client downloads a web app with JS code. In our example, static content is exposed from the GCS storage bucket.
The unauthenticated user is redirected to an external IdP login page for authentication.
On successful login, the external IdP returns the authentication result including an ID Token.
The ID Token contains information about identity and can be exchanged into an “access token”. This is accomplished on the Google Cloud side with a service called Secure Token Service (STS) (API documentation).
STS verifies the ID Token and, if successful, returns a Google Identity access token.
The access token can be used as a bearer token in subsequent Google Cloud API calls. Please note: by default, access tokens are valid for one hour (3,600 seconds). When the access token has expired, your token management code must obtain a new one.
As an example of incorporating this flow within your application we will use Azure Active Directory as an external IdP.
Azure AD requires the following steps to act as an IdP for Workforce Identity Federation:
Registering an application
Assigning users (and groups) to the enterprise application
Azure AD performs identity management only for registered applications, which is why the first thing we need to do is create a new application registration (go to Azure Active Directory and select App registrations). An important parameter to assign is the type of redirect URI; in our case we choose “Single-page application (SPA)”. If the goal is to provide integration with the Google Cloud Federated Console (console.cloud.google), we need to choose the “Web” type instead. More information about configuration and the federated version of the Cloud Console is provided in Configure Azure AD-based workforce identity federation.
The next step is to choose “ID Tokens” from the list of tokens issued by the authorization endpoint. For details, see OpenID Connect (OIDC).
From the information screen shown after you finish registration, note the “Application (client) ID” (this is the client ID needed for the Workforce Identity Pool Provider configuration in Google Cloud), and click on “Endpoints” above.
From the endpoints window, copy the “OpenID Connect metadata document” URL, navigate to it, and look for the “issuer” field. The value of this field (in the form https://login.microsoftonline.com/TENANT_ID/v2.0) will be required for the Workforce Identity Federation configuration.
The last step is to go to “Enterprise applications”, find your application, and assign the users and groups that you want to give access to your application.
Configuration steps to be executed on Google Cloud:
Specify billing project
Enable APIs
Create Workforce Identity Pool
Create Workforce Identity Pool Provider
Assign required permissions to external identities from the Pool
Before you begin, make sure the following APIs are enabled on the billing project (as Workforce Identity Federation artifacts live at the organization level, you need to specify the project which will be used for billing associated with those resources); a sketch of these commands follows below:
IAM API
Security Token Service API
You can find detailed information in the “Before you begin” section of product documentation.
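As a rough sketch (assuming the gcloud CLI and a placeholder project name, my-billing-project), enabling the APIs and setting the quota/billing project might look like this:

# Enable the required APIs on the project used for billing/quota purposes.
gcloud services enable iam.googleapis.com sts.googleapis.com \
    --project=my-billing-project

# Use this project as the quota/billing project for subsequent gcloud calls.
gcloud config set billing/quota_project my-billing-project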
Please consult the corresponding section of the Cloud SDK documentation for details of the configuration parameters.
This is a crucial step in the configuration, as here we establish the one-way trust to our IdP. We will need information from the Azure environment — the issuer URI and client ID — which we gathered in the previous step.
Note the attribute mapping parameter, where we decide which attribute (assertion) from the IdP is used for the required google.subject attribute. In our example we use the preferred_username assertion, which in the case of Azure AD carries the email of the authenticated user. This determines the syntax we will use for referencing external identities in Google Cloud, as described in Represent workforce pool users in IAM policies.
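A minimal sketch of creating the pool and its OIDC provider with the gcloud CLI might look like the following (placeholder resource names; additional flags may be required depending on your setup):

# Create the Workforce Identity Pool — a container for external identities.
gcloud iam workforce-pools create azure-ad-pool \
    --organization=ORGANIZATION_ID \
    --location=global \
    --display-name="Azure AD pool"

# Create the OIDC provider inside the pool, using the Azure AD issuer URI,
# the application (client) ID, and the attribute mapping discussed above.
gcloud iam workforce-pools providers create-oidc azure-ad-provider \
    --workforce-pool=azure-ad-pool \
    --location=global \
    --issuer-uri="https://login.microsoftonline.com/TENANT_ID/v2.0" \
    --client-id=APPLICATION_CLIENT_ID \
    --attribute-mapping="google.subject=assertion.preferred_username"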
Now we just need to assign the correct set of roles to our external identities. In the following example we assign the serviceUsageConsumer role, which is required to consume any Google Cloud API and so will also be necessary for your external identities.
$TEST_SUBJECT in our case is the email of one of the Azure AD users we assigned to our enterprise application.
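A hedged sketch of such a role grant (placeholder project and pool names), using the principal:// identifier format for workforce pool users:

# Grant the Service Usage Consumer role to a single external identity.
gcloud projects add-iam-policy-binding my-project \
    --role="roles/serviceusage.serviceUsageConsumer" \
    --member="principal://iam.googleapis.com/locations/global/workforcePools/azure-ad-pool/subject/${TEST_SUBJECT}"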
Now our Workforce Identity Federation should be ready to rock’n’roll!
To use Azure AD ID Tokens in a Web Application, Microsoft recommends using the Microsoft Authentication Library for JavaScript (MSAL.js). Several wrappers exist for this library to be used in e.g. Node.js, React and Angular.
To demonstrate using the Workforce Identity Federation, the following example is provided in Javascript using the Microsoft Authentication Library for JavaScript (MSAL.js) 2.0 for Browser-Based Single-Page Applications.
As a first step, the MSAL.js library must be loaded and initialized. For simplicity we are using the CDN version of the library; in most cases the NPM package should be used instead. The script is loaded in the HTML body with the async and defer options to ensure that it does not interfere with page loading time.
MSAL.js can use a popup or redirect login. Without a backend to handle the redirect, the popup method is the recommended method. To show a popup without triggering the popup blocker in modern browsers, the popup needs to be triggered by a user action (e.g. a button click) and must happen within a short period of time after the button was clicked.
As a result, an HTML button is added to trigger the login popup, along with inline JavaScript containing a setup function that is called after the MSAL.js library is loaded; it enables the login button and initializes a PublicClientApplication from the MSAL.js library. The PublicClientApplication initialization requires the clientId and authority as defined in the Azure AD single-page application setup above.
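As a rough sketch (assuming the msal global exposed by an MSAL.js 2.x CDN build, and placeholder client and tenant values), the setup might look like this; the login() handler it references is sketched a few paragraphs below:

<!-- Load MSAL.js from the CDN; async/defer keeps it off the critical rendering path. -->
<script src="https://alcdn.msauth.net/browser/2.38.0/js/msal-browser.min.js"
        async defer onload="setup()"></script>

<button id="login" disabled onclick="login()">Sign in</button>

<script>
  let msalApp;
  function setup() {
    // Placeholder values: the Application (client) ID and tenant come from
    // the Azure AD app registration created earlier.
    msalApp = new msal.PublicClientApplication({
      auth: {
        clientId: "APPLICATION_CLIENT_ID",
        authority: "https://login.microsoftonline.com/TENANT_ID",
      },
    });
    document.getElementById("login").disabled = false;
  }
</script>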
Now the login functionality is added to open the login popup and handle the login response. The response contains an access token and an ID token. The ID token will be sent to STS to be exchanged for a Google Identity access token. The access token returned after successful login is intended to be used with Azure.
In our example, it is used to query the Graph API to retrieve the user information from AD. To access Google Cloud resources, we need to exchange the ID token from Azure for an access token using Google Cloud STS, based on the trust established by the Workforce Identity Federation setup. The resulting Google Identity access token can then be used to access Google APIs, such as the Resource Manager API, to retrieve the list of projects visible to the current user (this requires that the user has the IAM permission resourcemanager.projects.get).
Please note that in calling Google Cloud STS we need to provide a proper audience, which is the URI of our Workforce Identity Federation pool provider.
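Below is a hedged JavaScript sketch of that flow — popup login, STS token exchange, then a Google Cloud API call — with placeholder pool, provider, and project values (it completes the setup() sketch above):

async function login() {
  // 1) Pop up the Azure AD login. The response contains both an access token
  //    (usable against Microsoft APIs such as Graph) and an ID token.
  const result = await msalApp.loginPopup({
    scopes: ["openid", "profile", "User.Read"],
  });

  // 2) Exchange the Azure AD ID token for a Google Cloud access token via STS.
  //    The audience must reference the Workforce Identity Pool Provider.
  const audience =
    "//iam.googleapis.com/locations/global/workforcePools/azure-ad-pool/providers/azure-ad-provider";
  const stsResponse = await fetch("https://sts.googleapis.com/v1/token", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      grantType: "urn:ietf:params:oauth:grant-type:token-exchange",
      audience: audience,
      scope: "https://www.googleapis.com/auth/cloud-platform",
      requestedTokenType: "urn:ietf:params:oauth:token-type:access_token",
      subjectToken: result.idToken,
      subjectTokenType: "urn:ietf:params:oauth:token-type:id_token",
      // Workforce federation needs a billing/quota project for the exchange.
      options: JSON.stringify({ userProject: "my-billing-project" }),
    }),
  });
  const { access_token } = await stsResponse.json();

  // 3) Use the Google access token as a bearer token against Google Cloud APIs,
  //    e.g. listing projects visible to the authenticated external identity.
  const projects = await fetch(
    "https://cloudresourcemanager.googleapis.com/v1/projects",
    { headers: { Authorization: `Bearer ${access_token}` } }
  );
  console.log(await projects.json());
}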
The access token can then be used to authenticate against all APIs and services which support Workforce Identity Federation.
The code shared above must be hosted on a valid HTTPS endpoint which is configured in AD as an endpoint for the single page application. This can be achieved with GCLB and a public GCS bucket.
Users and Groups from Active Directory can be represented in IAM policies using the following format:
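The exact identifier formats are documented in Represent workforce pool users in IAM policies; roughly, a single user and a mapped group look like this (placeholder pool and values):

# A single external identity (matches the mapped google.subject value):
principal://iam.googleapis.com/locations/global/workforcePools/POOL_ID/subject/SUBJECT_VALUE

# All identities in a group (requires a google.groups attribute mapping):
principalSet://iam.googleapis.com/locations/global/workforcePools/POOL_ID/group/GROUP_ID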
Important: All principals must have the roles/serviceusage.serviceUsageConsumer role, which contains the IAM permission serviceusage.services.use.
As with every software development exercise, sooner or later something goes wrong. Here is a list of hints we think may be useful when things are not going as planned.
Familiarize yourself with IAM logging: Read the Example logs for Workforce Identity Federation section of IAM documentation.
Use Cloud Logging: This should always be the first thing to do, check the logs in Logs Explorer looking for errors, unauthorized calls, and so on. Remember, to view logs you need corresponding permissions on the project.
Check permissions: In the text we mentioned permissions required for the external identity to call Google Cloud APIs. Check that your external identity has been assigned the necessary permissions, including the roles/serviceusage.serviceUsageConsumer role and the roles associated with the APIs being called. Check the preliminary requirements again.
Check your audience: When calling STS for token exchange you must have the correct audience set, which is referring to your Workforce Identity Pool Provider, check our code example for syntax.
A research paper shows that container image downloads account for 76 percent of container startup time, but on average only 6.4 percent of the data is needed for the container to start doing useful work. Starting and scaling out containerized applications requires downloading container images from a remote container registry. This may introduce a non-trivial latency, as the entire image must be downloaded and unpacked before the applications can be started.
One solution to this problem is lazy loading (also known as asynchronous loading) of container images, which downloads data from the container registry in parallel with the application startup. One example is stargz-snapshotter, a project that aims to improve the overall container start time.
Last year, we introduced Seekable OCI (SOCI), a technology open sourced by Amazon Web Services (AWS) that enables container runtimes to implement lazy loading the container image to start applications faster without modifying the container images. As part of that effort, we open sourced SOCI Snapshotter, a snapshotter plugin that enables lazy loading with SOCI in containerd.
AWS Fargate Support for SOCI
Today, I’m excited to share that AWS Fargate now supports Seekable OCI (SOCI), which helps applications deploy and scale out faster by enabling containers to start without waiting to download the entire container image. At launch, this new capability is available for Amazon Elastic Container Service (Amazon ECS) applications running on AWS Fargate.
Here’s a quick look to show how AWS Fargate support for SOCI works:
SOCI works by creating an index (SOCI index) of the files within an existing container image. This index is a key enabler to launching containers faster, providing the capability to extract an individual file from a container image without having to download the entire image. Your applications no longer need to wait to complete pulling and unpacking a container image before your applications start running. This allows you to deploy and scale out applications more quickly and reduce the rollout time for application updates.
A SOCI index is generated and stored separately from the container images. This means that your container images don’t need to be converted to use SOCI, therefore not breaking secure hash algorithm (SHA)-based security, such as container image signing. The index is then stored in the registry alongside the container image. At release, AWS Fargate support for SOCI works with Amazon Elastic Container Registry (Amazon ECR).
When you use Amazon ECS with AWS Fargate to run your SOCI-indexed containerized images, AWS Fargate automatically detects if a SOCI index for the image exists and starts the container without waiting for the entire image to be pulled. This also means that AWS Fargate will still continue to run container images that don’t have SOCI indexes.
Let’s Get Started
There are two ways to create SOCI indexes for container images: the AWS SOCI Index Builder, and the soci CLI provided by the soci-snapshotter project. The AWS SOCI Index Builder provides you with an automated process to get started and build SOCI indexes for your container images. The soci CLI provides you with more flexibility around index generation and the ability to natively integrate index generation in your CI/CD pipelines.
In this article, I manually generate SOCI indexes using the soci CLI from the soci-snapshotter project.
Create a Repository and Push Container Images
First, I create an Amazon ECR repository called pytorch-soci for my container image using the AWS CLI.
$ aws ecr create-repository --region us-east-1 --repository-name pytorch-soci
I keep the Amazon ECR URI output and define it as a variable to make it easier for me to refer to the repository in the next step.
$ ECRSOCIURI=xyz.dkr.ecr.us-east-1.amazonaws.com/pytorch-soci:latest
For the sample application, I use a PyTorch training (CPU-based) container image from AWS Deep Learning Containers. I use the nerdctl CLI to pull the container image because, by default, the Docker Engine stores the container image in the Docker Engine image store, not the containerd image store.
$ SAMPLE_IMAGE="763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-training:1.5.1-cpu-py36-ubuntu16.04"
$ aws ecr get-login-password --region us-east-1 | sudo nerdctl login --username AWS --password-stdin xyz.dkr.ecr.us-east-1.amazonaws.com
$ sudo nerdctl pull --platform linux/amd64 $SAMPLE_IMAGE
Then, I tag the container image for the repository that I created in the previous step.
$ sudo nerdctl tag $SAMPLE_IMAGE $ECRSOCIURI
Next, I need to push the container image into the ECR repository.
$ sudo nerdctl push $ECRSOCIURI
At this point, my container image is already in my Amazon ECR repository.
Create SOCI Indexes
Next, I need to create a SOCI index.
A SOCI index is an artifact that enables lazy loading of container images. A SOCI index consists of 1) a SOCI index manifest and 2) a set of zTOCs. The following image illustrates the components in a SOCI index manifest, and how it refers to a container image manifest.
The SOCI index manifest contains the list of zTOCs and a reference to the image for which the manifest was generated. A zTOC, or table of contents for compressed data, consists of two parts:
To learn more about these concepts and terms, please visit the soci-snapshotter Terminology page.
Before I can create SOCI indexes, I need to install the soci CLI. To learn more about how to install it, visit Getting Started with soci-snapshotter.
To create SOCI indexes, I use the soci create command.
$ sudo soci create $ECRSOCIURI
layer sha256:4c6ec688ebe374ea7d89ce967576d221a177ebd2c02ca9f053197f954102e30b -> ztoc skipped
layer sha256:ab09082b308205f9bf973c4b887132374f34ec64b923deef7e2f7ea1a34c1dad -> ztoc skipped
layer sha256:cd413555f0d1643e96fe0d4da7f5ed5e8dc9c6004b0731a0a810acab381d8c61 -> ztoc skipped
layer sha256:eee85b8a173b8fde0e319d42ae4adb7990ed2a0ce97ca5563cf85f529879a301 -> ztoc skipped
layer sha256:3a1b659108d7aaa52a58355c7f5704fcd6ab1b348ec9b61da925f3c3affa7efc -> ztoc skipped
layer sha256:d8f520dcac6d926130409c7b3a8f77aea639642ba1347359aaf81a8b43ce1f99 -> ztoc skipped
layer sha256:d75d26599d366ecd2aa1bfa72926948ce821815f89604b6a0a49cfca100570a0 -> ztoc skipped
layer sha256:a429d26ed72a85a6588f4b2af0049ae75761dac1bb8ba8017b8830878fb51124 -> ztoc skipped
layer sha256:5bebf55933a382e053394e285accaecb1dec9e215a5c7da0b9962a2d09a579bc -> ztoc skipped
layer sha256:5dfa26c6b9c9d1ccbcb1eaa65befa376805d9324174ac580ca76fdedc3575f54 -> ztoc skipped
layer sha256:0ba7bf18aa406cb7dc372ac732de222b04d1c824ff1705d8900831c3d1361ff5 -> ztoc skipped
layer sha256:4007a89234b4f56c03e6831dc220550d2e5fba935d9f5f5bcea64857ac4f4888 -> ztoc sha256:0b4d78c856b7e9e3d507ac6ba64e2e2468997639608ef43c088637f379bb47e4
layer sha256:089632f60d8cfe243c5bc355a77401c9a8d2f415d730f00f6f91d44bb96c251b -> ztoc sha256:f6a16d3d07326fe3bddbdb1aab5fbd4e924ec357b4292a6933158cc7cc33605b
layer sha256:f18dd99041c3095ade3d5013a61a00eeab8b878ba9be8545c2eabfbca3f3a7f3 -> ztoc sha256:95d7966c964dabb54cb110a1a8373d7b88cfc479336d473f6ba0f275afa629dd
layer sha256:69e1edcfbd217582677d4636de8be2a25a24775469d677664c8714ed64f557c3 -> ztoc sha256:ac0e18bd39d398917942c4b87ac75b90240df1e5cb13999869158877b400b865
From the above output, I can see that the soci CLI created zTOCs for four layers, which means that only these four layers will be lazily pulled, while the other container image layers will be downloaded in full before the container image starts. This is because lazily loading very small container image layers has little impact on launch time. However, you can configure this behavior using the --min-layer-size flag when you run soci create.
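For example (hypothetical threshold, assuming the flag takes a size in bytes), indexing only layers of 10 MiB or more might look like this:
$ sudo soci create --min-layer-size 10485760 $ECRSOCIURI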
Verify and Push SOCI Indexes
The soci CLI also provides several commands that can help you review the SOCI indexes that have been generated.
To see a list of all index manifests, I can run the following command.
$ sudo soci index list
DIGEST SIZE IMAGE REF PLATFORM MEDIA TYPE CREATED
sha256:ea5c3489622d4e97d4ad5e300c8482c3d30b2be44a12c68779776014b15c5822 1931 xyz.dkr.ecr.us-east-1.amazonaws.com/pytorch-soci:latest linux/amd64 application/vnd.oci.image.manifest.v1+json 10m4s ago
sha256:ea5c3489622d4e97d4ad5e300c8482c3d30b2be44a12c68779776014b15c5822 1931 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-training:1.5.1-cpu-py36-ubuntu16.04 linux/amd64 application/vnd.oci.image.manifest.v1+json 10m4s ago
Optionally, if I need to see the list of zTOCs, I can use the following command.
$ sudo soci ztoc list
DIGEST SIZE LAYER DIGEST
sha256:0b4d78c856b7e9e3d507ac6ba64e2e2468997639608ef43c088637f379bb47e4 2038072 sha256:4007a89234b4f56c03e6831dc220550d2e5fba935d9f5f5bcea64857ac4f4888
sha256:95d7966c964dabb54cb110a1a8373d7b88cfc479336d473f6ba0f275afa629dd 11442416 sha256:f18dd99041c3095ade3d5013a61a00eeab8b878ba9be8545c2eabfbca3f3a7f3
sha256:ac0e18bd39d398917942c4b87ac75b90240df1e5cb13999869158877b400b865 36277264 sha256:69e1edcfbd217582677d4636de8be2a25a24775469d677664c8714ed64f557c3
sha256:f6a16d3d07326fe3bddbdb1aab5fbd4e924ec357b4292a6933158cc7cc33605b 10152696 sha256:089632f60d8cfe243c5bc355a77401c9a8d2f415d730f00f6f91d44bb96c251b
This series of zTOCs contains all of the information that SOCI needs to find a given file in a layer. To review the zTOC for each layer, I can use one of the digest sums from the preceding output and use the following command.
$ sudo soci ztoc info sha256:0b4d78c856b7e9e3d507ac6ba64e2e2468997639608ef43c088637f379bb47e4
{
"version": "0.9",
"build_tool": "AWS SOCI CLI v0.1",
"size": 2038072,
"span_size": 4194304,
"num_spans": 33,
"num_files": 5552,
"num_multi_span_files": 26,
"files": [
{
"filename": "bin/",
"offset": 512,
"size": 0,
"type": "dir",
"start_span": 0,
"end_span": 0
},
{
"filename": "bin/bash",
"offset": 1024,
"size": 1037528,
"type": "reg",
"start_span": 0,
"end_span": 0
}
---Trimmed for brevity---
Now, I need to use the following command to push all SOCI-related artifacts into the Amazon ECR.
$ PASSWORD=$(aws ecr get-login-password --region us-east-1)
$ sudo soci push --user AWS:$PASSWORD $ECRSOCIURI
If I go to my Amazon ECR repository, I can verify the index is created. Here, I can see that two additional objects are listed alongside my container image: a SOCI Index and an Image index. The image index allows AWS Fargate to look up SOCI indexes associated with my container image.
Understanding SOCI Performance
The main objective of SOCI is to minimize the required time to start containerized applications. To measure the performance of AWS Fargate lazy loading container images using SOCI, I need to understand how long it takes for my container images to start with SOCI and without SOCI.
To understand the duration needed for each container image to start, I can use metrics available from the DescribeTasks API on Amazon ECS. The first metric is createdAt, the timestamp for when the task was created and entered the PENDING state. The second metric is startedAt, the time when the task transitioned from the PENDING state to the RUNNING state.
For this, I have created another Amazon ECR repository using the same container image but without generating a SOCI index, called pytorch-without-soci. If I compare these container images, pytorch-soci has two additional objects (an image index and a SOCI index) that don’t exist in pytorch-without-soci.
Deploy and Run Applications
To run the applications, I have created an Amazon ECS cluster called demo-pytorch-soci-cluster, a VPC, and the required ECS task execution role. If you’re new to Amazon ECS, you can follow Getting started with Amazon ECS to be more familiar with how to deploy and run your containerized applications.
Now, let’s deploy and run both container images with FARGATE as the launch type. I define five tasks each for pytorch-soci and pytorch-without-soci.
$ aws ecs \
--region us-east-1 \
run-task \
--count 5 \
--launch-type FARGATE \
--task-definition arn:aws:ecs:us-east-1:XYZ:task-definition/pytorch-soci \
--cluster socidemo
$ aws ecs \
--region us-east-1 \
run-task \
--count 5 \
--launch-type FARGATE \
--task-definition arn:aws:ecs:us-east-1:XYZ:task-definition/pytorch-without-soci \
--cluster socidemo
After a few minutes, there are 10 running tasks on my ECS cluster.
After verifying that all my tasks are running, I run the following script to get two metrics: createdAt and startedAt.
#!/bin/bash
CLUSTER=<CLUSTER_NAME>
TASKDEF=<TASK_DEFINITION>
REGION="us-east-1"
TASKS=$(aws ecs list-tasks \
--cluster $CLUSTER \
--family $TASKDEF \
--region $REGION \
--query 'taskArns[*]' \
--output text)
aws ecs describe-tasks \
--tasks $TASKS \
--region $REGION \
--cluster $CLUSTER \
--query "tasks[] | reverse(sort_by(@, &createdAt)) | [].[{startedAt: startedAt, createdAt: createdAt, taskArn: taskArn}]" \
--output table
Running the above command for the container image without SOCI indexes — pytorch-without-soci — produces the following output:
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| DescribeTasks |
+----------------------------------+-----------------------------------+------------------------------------------------------------------------------------------------------------+
| createdAt | startedAt | taskArn |
+----------------------------------+-----------------------------------+------------------------------------------------------------------------------------------------------------+
| 2023-07-07T17:43:59.233000+00:00| 2023-07-07T17:46:09.856000+00:00 | arn:aws:ecs:ap-southeast-1:xyz:task/demo-pytorch-soci-cluster/dcdf19b6e66444aeb3bc607a3114fae0 |
| 2023-07-07T17:43:59.233000+00:00| 2023-07-07T17:46:09.459000+00:00 | arn:aws:ecs:ap-southeast-1:xyz:task/demo-pytorch-soci-cluster/9178b75c98ee4c4e8d9c681ddb26f2ca |
| 2023-07-07T17:43:59.233000+00:00| 2023-07-07T17:46:21.645000+00:00 | arn:aws:ecs:ap-southeast-1:xyz:task/demo-pytorch-soci-cluster/7da51e036c414cbab7690409ce08cc99 |
| 2023-07-07T17:43:59.233000+00:00| 2023-07-07T17:46:00.606000+00:00 | arn:aws:ecs:ap-southeast-1:xyz:task/demo-pytorch-soci-cluster/5ee8f48194874e6dbba75a5ef753cad2 |
| 2023-07-07T17:43:59.233000+00:00| 2023-07-07T17:46:02.461000+00:00 | arn:aws:ecs:ap-southeast-1:xyz:task/demo-pytorch-soci-cluster/58531a9e94ed44deb5377fa997caec36 |
+----------------------------------+-----------------------------------+------------------------------------------------------------------------------------------------------------+
From the average aggregated delta time (between startedAt and createdAt) for each task, pytorch-without-soci (without SOCI indexes) successfully ran after 129 seconds.
Next, I run the same command, but for pytorch-soci, which comes with SOCI indexes.
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| DescribeTasks |
+----------------------------------+-----------------------------------+------------------------------------------------------------------------------------------------------------+
| createdAt | startedAt | taskArn |
+----------------------------------+-----------------------------------+------------------------------------------------------------------------------------------------------------+
| 2023-07-07T17:43:53.318000+00:00| 2023-07-07T17:44:51.076000+00:00 | arn:aws:ecs:ap-southeast-1:xyz:task/demo-pytorch-soci-cluster/c57d8cff6033494b97f6fd0e1b797b8f |
| 2023-07-07T17:43:53.318000+00:00| 2023-07-07T17:44:52.212000+00:00 | arn:aws:ecs:ap-southeast-1:xyz:task/demo-pytorch-soci-cluster/6d168f9e99324a59bd6e28de36289456 |
| 2023-07-07T17:43:53.318000+00:00| 2023-07-07T17:45:05.443000+00:00 | arn:aws:ecs:ap-southeast-1:xyz:task/demo-pytorch-soci-cluster/4bdc43b4c1f84f8d9d40dbd1a41645da |
| 2023-07-07T17:43:53.318000+00:00| 2023-07-07T17:44:50.618000+00:00 | arn:aws:ecs:ap-southeast-1:xyz:task/demo-pytorch-soci-cluster/43ea53ea84154d5aa90f8fdd7414c6df |
| 2023-07-07T17:43:53.318000+00:00| 2023-07-07T17:44:50.777000+00:00 | arn:aws:ecs:ap-southeast-1:xyz:task/demo-pytorch-soci-cluster/0731bea30d42449e9006a5d8902756d5 |
+----------------------------------+-----------------------------------+------------------------------------------------------------------------------------------------------------+
Here, I see that my SOCI-enabled container image — pytorch-soci — started 60 seconds after being created.
This means that running my sample application with SOCI indexes on AWS Fargate is approximately 50 percent faster compared to running without SOCI indexes.
It’s recommended to benchmark the startup and scaling-out time of your application with and without SOCI. This helps you to have a better understanding of how your application behaves and if your applications benefit from AWS Fargate support for SOCI.
Customer Voices
During the private preview period, we heard lots of feedback from our customers about AWS Fargate support for SOCI. Here’s what our customers say:
Autodesk provides critical design, make, and operate software solutions across the architecture, engineering, construction, manufacturing, media, and entertainment industries. “SOCI has given us a 50% improvement in startup performance for our time-sensitive simulation workloads running on Amazon ECS with AWS Fargate. This allows our application to scale out faster, enabling us to quickly serve increased user demand and save on costs by reducing idle compute capacity. The AWS Partner Solution for creating the SOCI index is easy to configure and deploy.” – Boaz Brudner, Head of Innovyze SaaS Engineering, AI and Architecture, Autodesk.
Flywire is a global payments enablement and software company, on a mission to deliver the world’s most important and complex payments. “We run multi-step deployment pipelines on Amazon ECS with AWS Fargate which can take several minutes to complete. With SOCI, the total pipeline duration is reduced by over 50% without making any changes to our applications, or the deployment process. This allowed us to drastically reduce the rollout time for our application updates. For some of our larger images of over 750MB, SOCI improved the task startup time by more than 60%.”, Samuel Burgos, Sr. Cloud Security Engineer, Flywire.
Virtuoso is a leading software corporation that makes functional UI and end-to-end testing software. “SOCI has helped us reduce the lag between demand and availability of compute. We have very bursty workloads which our customers expect to start as fast as possible. SOCI helps our ECS tasks spin-up 40% faster, allowing us to quickly scale our application and reduce the pool of idle compute capacity, enabling us to deliver value more efficiently. Setting up SOCI was really easy. We opted to use the quick-start AWS Partner’s solution with which we could leave our build and deployment pipelines untouched.”, Mathew Hall, Head of Site Reliability Engineering, Virtuoso.
Things to Know
Availability — AWS Fargate support for SOCI is available in all AWS Regions where Amazon ECS, AWS Fargate, and Amazon ECR are available.
Pricing — AWS Fargate support for SOCI is available at no additional cost and you will only be charged for storing the SOCI indexes in Amazon ECR.
Get Started — Learn more about benefits and how to get started on the AWS Fargate Support for SOCI page.
Happy building.
— Donnie
Last month, Microsoft announced that it would continue its put-ChatGPT-in-everything adventure with a new Windows 11 feature called Copilot. The company added generative AI to Edge and to the Bing-powered taskbar Search field months ago, but Copilot promises to be the most visible and hard-to-ignore version of Microsoft's big AI push in its most visible and hard-to-ignore product.
This week's Windows Insider Preview build for Dev channel users, build 23493, will be the first to enable Copilot for public testers. After installing the update, preview users can press Windows + C to open a Copilot column on the right side of the screen. It will use the same Microsoft account you use for the rest of the OS (it's unclear whether it will work without a Microsoft account, though, to date, the preview has required sign-up and sign-in). And like the other Bing Chat implementations, it has three different "conversation style" settings that either try to rein the chatbot in and keep its answers straightforward and factual or allow it to get "more creative" but more prone to confabulations.
In addition to chatting, Copilot will also support creating AI images using OpenAI's DALL-E 2 model, the same technology used for the Bing Image Creator. Some features announced last month, including third-party plugin support, aren't included in this initial preview, and later versions will also be able to adjust a wider range of Windows settings.
Did you read the news about the Windows XP activation algorithm getting cracked and suddenly get nostalgic for the blue skies and bluer taskbar of that old Windows release? Or maybe you just like attractive, high-resolution desktop wallpapers and you want to make a change? It turns out that Microsoft's design team has rendered an updated 4K version of the default Windows XP wallpaper—you might know it by its name, "Bliss."
It's one of several retro-themed wallpapers on this Microsoft Design site, including photorealistic renderings of Solitaire, Paint, and (of course) Clippy. The site has been around for a while and hasn't been updated since December 2022, but Windows engineer Jennifer Gentleman tweeted about it yesterday—it's new to me and maybe to you, too. The most recent wallpapers appear to be products of Microsoft's Design Week event.
Among others, the Microsoft Design site also hosts the default wallpapers that have come with several Surface PCs, quite a few Pride Month-themed wallpaper designs, and several images focused on the company's recent emoji redesigns and the icons for the Microsoft 365 apps.
CI/CD servers are high-value targets for attackers because of their central role in critical development processes. They provide access to source code, a valuable asset for software companies, and can deploy code to production environments, creating serious risks if not adequately secured. Even a single vulnerability can enable attackers to compromise the supply chain, inject malware, and seize control of systems.
According to “The State of Software Supply Chain Security 2023”, these risks have contributed to a rise in supply chain attacks since 2020, and 57% of organizations have suffered security incidents related to DevOps toolchain exposures.
To avoid data breaches and business disruptions, securing CI/CD servers should be a top priority. Furthermore, Google’s “2022 Accelerate State of DevOps Report” suggests that implementing proper security controls can have a positive impact on software delivery performance.
In this whitepaper, we present 9 effective ways to prevent a supply chain attack on your CI/CD server, providing practical guidance and best practices to help you strengthen security and protect critical development processes.
By implementing these strategies, you can minimize the risk of a supply chain attack and ensure the integrity, availability, and confidentiality of your software supply chain.
Ivory, Tapbots’ Mastodon client, is now available on the Mac, and like its iOS and iPadOS counterparts that Federico reviewed in January, Ivory for Mac is every bit as polished.
A lot has changed since Ivory was released on the iPhone and iPad. At the time, there were hardly any native Mastodon apps for the Mac, so I was using Elk in a pinned Safari tab. That’s changed. There are several excellent native apps now, including Mona, which I reviewed earlier this month. What Ivory brings to the growing field of native apps is what we saw with iOS and iPadOS: impeccable taste and snappy performance that few other apps can match.
By now, most MacStories readers are probably familiar with the table stakes features for Mastodon clients. Ivory ticks all of those boxes. Also, if you’ve already tried Ivory for iOS or iPadOS, you’ve got a big head start on the Mac app because they’re very similar. However, if you’re new to Ivory, I encourage you to check out Federico’s review of Ivory for the iPhone and iPad because I’m not going to cover that same ground again. Instead, I want to focus on the Mac version’s unique features and the details that make it such a compelling choice for Mac users.
Opening Ivory as a single, narrow column closely resembles the iPhone version of the app.
The most apparent difference is Ivory for Mac’s multi-column layout. Set to a single, narrow column, Ivory for Mac resembles its iPhone sibling but with the app’s tabs on the left edge of the window. However, in the bottom left corner of the window is the button for expanding the number of Ivory’s columns. You’ll find the same button in the iPad app, where you’re limited to a single additional column. In contrast, the Mac app supports up to six columns.
Ivory’s multi-column design is the most readable of any Mastodon app I’ve used. It’s easy for a window with multiple columns of text and media to look cluttered, so it’s a testament to Ivory’s design that it’s as readable as it is. One of the touches that helps a lot is that instead of including a tab bar for each column, Ivory uses drop-down menus at the top of each column to allow users to pick what the column shows. That eliminates a lot of duplicative interface elements you find in other apps like Mona.
Ivory for Mac supports up to six columns.
I really appreciate the additional columns I can open on the Mac. When I use Ivory on my iPhone, it’s usually to read my own timeline. However, when I’m at my Mac, I’m usually working and want to keep tabs on the mentions coming into our MacStories accounts. With Ivory, I typically open columns for my own timeline and mentions, plus the mentions for MacStories, Club MacStories, and AppStories. If that becomes too distracting, or it’s a quiet day without a lot of activity, though, I can easily close the columns and focus on just my timeline.
Ivory’s new post indicator is much easier to read than many others I’ve seen.
Another design elements of Ivory for Mac that I love are the indicator that shows how many new messages are available in your timeline. It’s a small touch, but it’s very readable at a glance, especially using the bright yellow accent color that I’ve chosen. Similarly, I appreciate the clear label for private mentions. I’m not a fan of mixing private mentions in the main timeline, but that’s how Mastodon works, so at least Ivory makes those messages stand out as different in kind from other posts.
I love how clearly Ivory labels private messages.
Also, if I mute an account in Ivory, it only disappears for the account from which it’s muted, whereas Mona hides muted accounts from columns displaying different accounts as well. I understand the rationale that if you don’t want to see a post in one timeline, you might not want to see it anywhere. However, that’s not always the case and rarely is for me. It’s more work to mute an account in multiple places, but I like that Ivory leaves the choice up to me instead of assuming I don’t want to see a post in any timeline.
There are two features I’d love to see Ivory add: saved or recent searches and the ability to save the local views of particular Mastodon instances. I often repeat searches on Mastodon, and having a saved or recents list would eliminate the friction of retyping searches. I also follow a couple of app development servers that I’d love to be able to save as a pinned view in Ivory.
Those small items aside, though, Ivory for Mac is every bit as elegantly designed and performant as Federico described in his review of the iPhone and iPad versions of the app. Thinking back to when I first started using Mastodon, it’s incredible how far developers have elevated the options users have. Just a few short months ago, there were hardly any native Mac clients on the Mac. Now there are several, with Ivory right there among the very best.
Ivory is available on the Mac App Store as a free download with many of its features unlocked via a subscription, which is $1.99/month, $14.99/year, or $24.99/year for the iOS, iPadOS, and macOS versions as a bundle.
When I searched for the best Mac email clients for Gmail/Google Apps users in September, I was surprised to find that there was an app built specifically for this purpose. You didn't need to customize it, change its settings, or bolt on a bunch of extensions to make it work and feel right; Mimestream was both deeply hooked into Gmail and very much a Mac app.
Mimestream spent more than three years in a free beta period, releasing more than 220 updates for 167,000 users and adding more than 100 features. Now that a 1.0 release is out—and the company has grown from a solo developer to a five-person team—there's a price for the product.
Mimestream is $30 per year if you buy during this launch period, then $50 per year after that (if you were a beta user, check your inbox for a bigger discount code). There's still a 14-day, no-credit-card-required trial period. Individual users can install it on up to five devices, and there's Family Sharing across iCloud accounts.
The Workers Browser Rendering API allows developers to programmatically control and interact with a headless browser instance and create automation flows for their applications and products.
Since the private beta announcement, based on the feedback we've been receiving and our own roadmap, the team has been working on the developer experience and improving the platform architecture for the best possible performance and reliability. Today we enter the open beta and will start onboarding the customers on the wait list.
Starting today, Wrangler, our command-line tool for configuring, building, and deploying applications with Cloudflare developer products, has support for the Browser Rendering API bindings.
You can install Wrangler Beta using npm:
npm install wrangler --save-dev
Bindings allow your Workers to interact with resources on the Cloudflare developer platform. In this case, they will provide your Worker script with an authenticated endpoint to interact with a dedicated Chromium browser instance.
This is all you need in your wrangler.toml once this service is enabled for your account:
browser = { binding = "MYBROWSER", type = "browser" }
Now you can deploy any Worker script that requires Browser Rendering capabilities. You can spawn Chromium instances and interact with them programmatically in any way you typically do manually behind your browser.
Under the hood, the Browser Rendering API gives you access to a WebSocket endpoint that speaks the DevTools Protocol. DevTools is what allows us to instrument a Chromium instance running in our global network, and it's the same protocol that Chrome uses on your computer when you inspect a page.
With enough dedication, you can, in fact, implement your own DevTools client and talk the protocol directly. But that'd be crazy; almost no one does that.
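For the curious, here is a rough sketch of what driving the protocol by hand can look like. This is purely illustrative: it assumes a locally running Chromium with its DevTools port open and uses the Node.js ws package with a placeholder target ID, rather than the Workers binding described above.

import WebSocket from "ws";

// Connect to a page target's DevTools WebSocket (the target ID is a placeholder).
const ws = new WebSocket("ws://127.0.0.1:9222/devtools/page/<targetId>");

ws.on("open", () => {
  // Every DevTools Protocol message is JSON with an id, a method, and optional params.
  ws.send(JSON.stringify({ id: 1, method: "Page.enable" }));
  ws.send(JSON.stringify({ id: 2, method: "Page.navigate", params: { url: "https://example.com" } }));
});

ws.on("message", (data) => {
  // Responses and events come back as JSON messages, matched to requests by id.
  console.log(JSON.parse(data.toString()));
});

Workable, but you quickly end up re-implementing request/response matching, event handling, and higher-level concepts like pages and frames yourself.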
So…
Puppeteer is one of the most popular libraries that abstract the lower-level DevTools protocol from developers and provide a high-level API that you can use to easily instrument Chrome/Chromium and automate browsing sessions. It's widely used for things like creating screenshots, crawling pages, and testing web applications.
Puppeteer typically connects to a local Chrome or Chromium browser using the DevTools port.
We forked a version of Puppeteer and patched it to connect to the Workers Browser Rendering API instead. The changes are minimal; after connecting, developers can use the full Puppeteer API as they would on a standard setup.
Our version is open sourced here, and the package can be installed from npm as @cloudflare/puppeteer. Using it from a Worker is as easy as:
import puppeteer from "@cloudflare/puppeteer";
And then all it takes to launch a browser from your script is:
const browser = await puppeteer.launch(env.MYBROWSER);
In the long term, we will keep updating Puppeteer so that it matches the version of the Chromium instances running in our network.
Following the tradition with other Developer products, we created a dedicated section for the Browser Rendering APIs in our Developer's Documentation site.
You can access this page to learn more about how the service works, Wrangler support, APIs, and limits, and find examples of starter templates for common applications.
Taking screenshots from web pages is one of the typical cases for browser automation.
Let's create a Worker that uses the Browser Rendering API to do just that. This is a perfect example of how to set everything up and get an application running in minutes: it will give you a good overview of the steps involved and the basics of the Puppeteer API, and from there you can move on to more sophisticated use cases.
Step one, start a project, install Wrangler and Cloudflare’s fork of Puppeteer:
npm init -f
npm install wrangler --save-dev
npm install @cloudflare/puppeteer --save-dev
Step two, let’s create the simplest possible wrangler.toml configuration file with the Browser Rendering API binding:
name = "browser-worker"
main = "src/index.ts"
compatibility_date = "2023-03-14"
node_compat = true
workers_dev = true
browser = { binding = "MYBROWSER", type = "browser" }
Step three, create src/index.ts with your Worker code:
import puppeteer from "@cloudflare/puppeteer";

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { searchParams } = new URL(request.url);
    let url = searchParams.get("url");
    let img: Buffer;
    if (url) {
      const browser = await puppeteer.launch(env.MYBROWSER);
      const page = await browser.newPage();
      await page.goto(url);
      img = (await page.screenshot()) as Buffer;
      await browser.close();
      return new Response(img, {
        headers: {
          "content-type": "image/jpeg",
        },
      });
    } else {
      return new Response(
        "Please add the ?url=https://example.com/ parameter"
      );
    }
  },
};
That's it, no more steps. This Worker instantiates a browser using Puppeteer, opens a new page, navigates to whatever you put in the "url" parameter, takes a screenshot of the page, closes the browser, and responds with the JPEG image of the screenshot. It can't get any easier to get started with the Browser Rendering API.
Run npx wrangler dev --remote to test it and npx wrangler publish when you're done.
You can explore the entire Puppeteer API and implement other functionality and logic from here. And, because it's Workers, you can add other developer products to your code: you might need a relational database, a KV store to cache your screenshots, an R2 bucket to archive your crawled pages and assets, a Durable Object to keep your browser instance alive and share it across multiple requests, or Queues to handle your jobs asynchronously. We have all of this and more.
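As a rough sketch of the KV idea, here is what caching screenshots might look like. The SCREENSHOTS namespace binding and the one-hour TTL are illustrative assumptions layered on top of the example above; they are not part of the Browser Rendering API itself.

import puppeteer from "@cloudflare/puppeteer";

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url).searchParams.get("url");
    if (!url) {
      return new Response("Please add the ?url=https://example.com/ parameter");
    }

    // Serve from the (hypothetical) SCREENSHOTS KV namespace if we already rendered this page.
    const cached = await env.SCREENSHOTS.get(url, { type: "arrayBuffer" });
    if (cached) {
      return new Response(cached, { headers: { "content-type": "image/jpeg" } });
    }

    // Otherwise render it with the Browser Rendering API, as in the example above.
    const browser = await puppeteer.launch(env.MYBROWSER);
    const page = await browser.newPage();
    await page.goto(url);
    const img = (await page.screenshot()) as Buffer;
    await browser.close();

    // Cache the result for an hour so repeat requests skip the browser entirely.
    await env.SCREENSHOTS.put(url, img, { expirationTtl: 3600 });
    return new Response(img, { headers: { "content-type": "image/jpeg" } });
  },
};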
You can also find this and other examples of how to use Browser Rendering in the Developer Documentation.
Dogfooding our products is one of the best ways to test and improve them, and in some cases, our internal needs dictate or influence our roadmap. Workers Browser Rendering is a good example of that; it was born out of our necessities before we realized it could be a product. We've been using it extensively for things like taking screenshots of pages for social sharing or dashboards, testing web software in CI, or gathering page load performance metrics of our applications.
But there's one product we've been using to stress test and push the limits of the Browser Rendering API and drive the engineering sprints that brought us to open the beta to our customers today: The Cloudflare Radar URL Scanner.
The URL Scanner scans any URL and compiles a full report containing technical, performance, privacy, and security details about that page. It's processing thousands of scans per day currently. It was built on top of Workers and uses a combination of the Browser Rendering APIs with Puppeteer to create enriched HAR archives and page screenshots, Durable Objects to reuse browser instances, Queues to handle customers' load and execute jobs asynchronously, and R2 to store the final reports.
This tool will soon have its own "how we built it" blog. Still, we wanted to let you know about it now because it is a good example of how you can build sophisticated applications using Browser Rendering APIs at scale starting today.
The team will keep improving the Browser Rendering API, but a few things are worth mentioning today.
First, we are looking into upstreaming the changes in our Puppeteer fork to the main project so that using the official library with the Cloudflare Workers Browser Rendering API becomes as easy as a configuration option.
Second, one of the reasons we decided to expose the raw DevTools protocol in the Worker binding is so that it can support other browser instrumentation libraries in the future. Playwright is a good example of another popular library that developers want to use.
And last, we are also keeping an eye on and testing WebDriver BiDi, a "new standard browser automation protocol that bridges the gap between the WebDriver Classic and CDP (DevTools) protocols." Click here to know more about the status of WebDriver BiDi.
The Workers Browser Rendering API enters open beta today. We will gradually be enabling the customers in the wait list in batches and sending them emails. We look forward to seeing what you will be building with it and want to hear from you.
As usual, you can talk to us on our Developers Discord or the Community forum; the team will be listening.
We’re excited to announce Secrets Store - Cloudflare’s new secrets management offering!
A secrets store does exactly what the name implies - it stores secrets. Secrets are developer-managed variables that contain sensitive information - information that only authorized users and systems should have access to.
If you're building an application, there are various types of secrets that you need to manage. Every system should be designed with identity and authentication data that verifies some form of identity in order to grant access to a system or application. One example of this is API tokens for making read and write requests to a database. Failure to store these tokens securely could lead to unauthorized access to information - intentional or accidental.
The stakes with secrets management are high. Every gap in the storage of these values has the potential to lead to a data leak or compromise - a security administrator's worst nightmare.
Developers are primarily focused on creating applications: they want to build quickly, they want their systems to be performant, and they want them to scale. For them, secrets management is about ease of use, performance, and reliability. On the other hand, security administrators are tasked with ensuring that these secrets remain secure. It's their responsibility to safeguard sensitive information, ensure that security best practices are met, and manage the fallout of an incident such as a data leak or breach. It's their job to verify that developers at their company are building in a secure and foolproof manner.
In order for developers to build at high velocity and for security administrators to feel at ease, companies need to adopt a highly reliable and secure secrets manager. This should be a system that ensures that sensitive information is stored with the highest security measures, while maintaining ease of use that will allow engineering teams to efficiently build.
Cloudflare's mission is to help build a better Internet - and that means a more secure Internet. We recognize our customers' need for a secure, centralized repository for storing sensitive data. Within the Cloudflare ecosystem, there are various places where customers need to store and access API and authorization tokens, shared secrets, and sensitive information. It's our job to make it easy for customers to manage these values securely.
The need for secrets management goes beyond Cloudflare. Customers have sensitive data that they manage everywhere - at their cloud provider, on their own infrastructure, across machines. Our plan is to make our Secrets Store a one-stop shop for all of our customers' secrets.
In 2020, we launched environment variables and secrets for Cloudflare Workers, allowing customers to create and encrypt variables across their Worker scripts. By doing this, developers can obfuscate the value of a variable so that it’s no longer available in plaintext and can only be accessed by the Worker.
Adoption and use of these secrets is quickly growing. We now have more than three million Workers scripts that reference variables and secrets managed through Cloudflare. One piece of feedback that we continue to hear from customers is that these secrets are scoped too narrowly.
Today, customers can only use a variable or secret within the Worker that it's associated with. In practice, though, customers have secrets that they share across Workers. They don't want to re-create those secrets and spend their time keeping them in sync. They want account-level secrets that are managed in one place but can be referenced across multiple Workers scripts and functions.
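For context, here is roughly what today's per-Worker scoping looks like in code; the API_TOKEN name is an illustrative assumption for a secret uploaded with wrangler secret put:

export default {
  async fetch(request: Request, env: { API_TOKEN: string }): Promise<Response> {
    // The secret is exposed only to this Worker through its env bindings;
    // a second Worker that needs the same token has to define its own copy today.
    const upstream = await fetch("https://api.example.com/data", {
      headers: { Authorization: `Bearer ${env.API_TOKEN}` },
    });
    return new Response(await upstream.text(), { status: upstream.status });
  },
};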
Outside of Workers, there are many use cases for secrets across Cloudflare services.
Inside our Web Application Firewall (WAF), customers can make rules that look for authorization headers in order to grant or deny access to requests. Today, when customers create these rules, they put the authorization header value in plaintext, so that anyone with WAF access in the Cloudflare account can see its value. What we’ve heard from our customers is that even internally, engineers should not have access to this type of information. Instead, what our customers want is one place to manage the value of this header or token, so that only authorized users can see, create, and rotate this value. Then when creating a WAF rule, engineers can just reference the associated secret e.g.“account.mysecretauth”. By doing this, we help our customers secure their system by reducing the access scope and enhance management of this value by keeping it updated in one place.
With new Cloudflare products and features quickly developing, we’re hearing more and more use cases for a centralized secrets manager. One that can be used to store Access Service tokens or shared secrets for Webhooks.
With the new account level Secrets Store, we’re excited to give customers the tools they need to manage secrets across Cloudflare services.
To have a secrets store, there are a number of measures that need to be in place, and we’re committing to providing these for our customers.
First, we're going to give our customers the tools they need to restrict access to secrets. We will have scoped permissions that allow admins to choose which users can view, create, edit, or remove secrets. We also plan to add the same level of granularity to our services - giving customers the ability to say "only allow this Worker to access this secret and only allow this set of Firewall rules to access that secret".
Next, we're going to give our customers extensive audit logs that allow them to track the access and use of their secrets. Audit logs are crucial for security administrators. They can be used to alert team members that a secret was used by an unauthorized service or that a compromised secret is being accessed when it shouldn't be. We will give customers audit logs for every secret-related event, so that they can see exactly who is making changes to secrets, which services are accessing them, and when.
In addition to the built-in security of the Secrets Store, we’re going to give customers the tools to rotate their encryption keys on-demand or at a cadence that fits the right security posture for them.
We're excited to get the Secrets Store into our customers' hands. If you're interested in using this, please fill out this form, and we'll reach out to you when it's ready to use.
By now, you’ve likely heard that passwordless Google accounts have finally arrived. The replacement for passwords is known as "passkeys."
There are many misconceptions about passkeys, both in terms of their usability and the security and privacy benefits they offer compared with current authentication methods. That’s not surprising, given that passwords have been in use for the past 60 years, and passkeys are so new. The long and short of it is that with a few minutes of training, passkeys are easier to use than passwords, and in a matter of months—once a dozen or so industry partners finish rolling out the remaining pieces—using passkeys will be easier still. Passkeys are also vastly more secure and privacy-preserving than passwords, for reasons I'll explain later.
This article provides a primer to get people started with Google's implementation of passkeys and explains the technical underpinnings that make them a much easier and more effective way to protect against account takeovers. A handful of smaller sites—specifically, PayPal, Instacart, Best Buy, Kayak, Robinhood, Shop Pay, and Cardpointers—have rolled out various options for logging in with passkeys, but those choices are more proofs of concept than working solutions. Google is the first major online service to make passkeys available, and its offering is refined and comprehensive enough that I’m recommending people turn them on today.
nix has a reputation for being confusing (it has its whole own programming language!), so I’ve been trying to figure out how to use nix in a way that’s as simple as possible and does not involve managing any configuration files or learning a new programming language. Here’s what I’ve figured out so far! We’ll talk about how to:
As usual I’ve probably gotten some stuff wrong in this post since I’m still pretty new to nix. I’m also still not sure how much I like nix – it’s very confusing! But it’s helped me compile some software that I was struggling to compile otherwise, and in general it seems to install things faster than homebrew.
People often describe nix as “declarative package management”. I don’t care that much about declarative package management, so here are two things that I appreciate about nix:
I think that the reason nix is good at compiling software is that you can have multiple versions of the same package installed at the same time. For example, right now I have one version of nodejs at /nix/store/4ykq0lpvmskdlhrvz1j3kwslgc6c7pnv-nodejs-16.17.1 and one at /nix/store/5y4bd2r99zhdbir95w5pf51bwfg37bwa-nodejs-18.9.1, and because each package is built against its own dependencies there's no LD_LIBRARY_PATH mess! I'll give a couple of examples later in this post of two times nix made it easier for me to compile software.
here's how I got started with nix:
- put ~/.nix-profile/bin on my PATH
- install packages with nix-env -iA nixpkgs.NAME
Basically the idea is to treat nix-env -iA like brew install or apt-get install.
For example, if I want to install fish, I can do that like this:
nix-env -iA nixpkgs.fish
This seems to just download some binaries from cache.nixos.org – pretty simple.
Some people use nix to install their Node and Python and Ruby packages, but I haven’t
been doing that – I just use npm install
and pip install
the same way I
always have.
There are a bunch of nix features/tools that I’m not using, but that I’ll mention. I originally thought that you had to use these features to use nix, because most of the nix tutorials I’ve read talk about them. But you don’t have to use them.
I won’t go into these because I haven’t really used them and there are lots of explanations out there.
I think packages in the main nix package repository are defined in github.com/NixOS/nixpkgs/
It looks like you can search for packages at search.nixos.org/packages. The two official ways to search packages seem to be:
- nix-env -qaP NAME, which is very extremely slow and which I haven't been able to get to actually work
- nix --extra-experimental-features 'nix-command flakes' search nixpkgs NAME, which does seem to work but is kind of a mouthful. Also all of the packages it prints out start with legacyPackages for some reason
I found a way to search nix packages from the command line that I liked better:
- run nix-env -qa '*' > nix-packages.txt to get a list of every package in the Nix repository
- write a little nix-search script that just greps packages.txt (cat ~/bin/nix-packages.txt | awk '{print $1}' | rg "$1")
One of nix's major design choices is that there isn't one single bin with all your packages; instead, you use symlinks. There are a lot of layers of symlinks. A few examples of symlinks:
- ~/.nix-profile on my machine is (indirectly) a symlink to /nix/var/nix/profiles/per-user/bork/profile-111-link/
- ~/.nix-profile/bin/fish is a symlink to /nix/store/afkwn6k8p8g97jiqgx9nd26503s35mgi-fish-3.5.1/bin/fish
When I install something, it creates a new profile-112-link
directory with new symlinks and updates my ~/.nix-profile
to point to that directory.
I think this means that if I install a new version of fish
and I don’t like it, I can
easily go back just by running nix-env --rollback
– it’ll move me to my previous profile directory.
If I uninstall a nix package like this, it doesn’t actually free any hard drive space, it just removes the symlinks.
$ nix-env --uninstall oil
I’m still not sure how to actually delete the package – I ran a garbage collection like this, which seemed to delete some things:
$ nix-collect-garbage
...
85 store paths deleted, 74.90 MiB freed
But I still have oil
on my system at /nix/store/8pjnk6jr54z77jiq5g2dbx8887dnxbda-oil-0.14.0
.
There’s a more aggressive version of nix-collect-garbage
that also deletes old versions of your profiles (so that you can’t rollback)
$ nix-collect-garbage -d --delete-old
That doesn’t delete /nix/store/8pjnk6jr54z77jiq5g2dbx8887dnxbda-oil-0.14.0
either though and I’m not sure why.
It looks like you can upgrade nix packages like this:
nix-channel --update
nix-env --upgrade
(similar to apt-get update && apt-get upgrade
)
I haven’t really upgraded anything yet. I think that if something goes wrong with an upgrade, you can roll back (because everything is immutable in nix!) with
nix-env --rollback
Someone linked me to this post from Ian Henry that
talks about some confusing problems with nix-env --upgrade
– maybe it
doesn’t work the way you’d expect? I guess I’ll be wary around upgrades.
paperjam
After a few months of installing existing packages, I wanted to make a custom package with nix for a program called paperjam that wasn’t already packaged.
I was actually struggling to compile paperjam at all, even without nix, because the version of libiconv I had on my system was wrong. I thought it might be easier to compile it with nix even though I didn't know how to make nix packages yet. And it actually was!
But figuring out how to get there was VERY confusing, so here are some notes about how I did it.
Before I started working on my paperjam
package, I wanted to build an example existing package just to
make sure I understood the process for building a package. I was really
struggling to figure out how to do this, but I asked in Discord and someone
explained to me how I could get a working package from github.com/NixOS/nixpkgs/ and build it. So here
are those instructions:
step 1: Download some arbitrary package from nixpkgs on github, for example the dash
package:
wget https://raw.githubusercontent.com/NixOS/nixpkgs/47993510dcb7713a29591517cb6ce682cc40f0ca/pkgs/shells/dash/default.nix -O dash.nix
step 2: Replace the first statement ({ lib, stdenv, buildPackages, autoreconfHook, pkg-config, fetchurl, fetchpatch, libedit, runCommand, dash }:) with with import <nixpkgs> {};. I don't know why you have to do this, but it works.
step 3: Run nix-build dash.nix
This compiles the package
step 4: Run nix-env -i -f dash.nix
This installs the package into my ~/.nix-profile
That’s all! Once I’d done that, I felt like I could modify the dash
package and make my own package.
paperjam
has one dependency (libpaper
) that also isn’t packaged yet, so I needed to build libpaper
first.
Here’s libpaper.nix
. I basically just wrote this by copying and pasting from
other packages in the nixpkgs repository.
My guess is that what's happening here is that nix has some default rules for compiling C packages (like "run make install"), so the make install happens by default and I don't need to configure it explicitly.
with import <nixpkgs> {};
stdenv.mkDerivation rec {
  pname = "libpaper";
  version = "0.1";

  src = fetchFromGitHub {
    owner = "naota";
    repo = "libpaper";
    rev = "51ca11ec543f2828672d15e4e77b92619b497ccd";
    hash = "sha256-S1pzVQ/ceNsx0vGmzdDWw2TjPVLiRgzR4edFblWsekY=";
  };

  buildInputs = [ ];

  meta = with lib; {
    homepage = "https://github.com/naota/libpaper";
    description = "libpaper";
    platforms = platforms.unix;
    license = with licenses; [ bsd3 gpl2 ];
  };
}
Basically this just tells nix how to download the source from GitHub.
I built this by running nix-build libpaper.nix
Next, I needed to compile paperjam. Here's a link to the nix package I wrote. The main things I needed to do, other than telling it where to download the source, were:
- add a build dependency (asciidoc)
- set some install flags (installFlags = [ "PREFIX=$(out)" ];) so that it installed in the correct directory instead of /usr/local/bin.
I set the hashes by first leaving the hash empty, then running nix-build to get an error message complaining about a mismatched hash. Then I copied the correct hash out of the error message.
I figured out how to set installFlags
just by running rg PREFIX
in the nixpkgs repository – I figured that needing to set a PREFIX
was
pretty common and someone had probably done it before, and I was right. So I
just copied and pasted that line from another package.
Then I ran:
nix-build paperjam.nix
nix-env -i -f paperjam.nix
and then everything worked and I had paperjam
installed! Hooray!
hugo
Right now I build this blog using Hugo 0.40, from 2018. I don’t need any new features so I haven’t felt a need to upgrade. On Linux this is easy: Hugo’s releases are a static binary, so I can just download the 5-year-old binary from the releases page and run it. Easy!
But on this Mac I ran into some complications. Mac hardware has changed in the
last 5 years, so the Mac Hugo binary I downloaded crashed. And when I tried to
build it from source with go build
, that didn’t work either because Go build
norms have changed in the last 5 years as well.
I was working around this by running Hugo in a Linux docker container, but I didn’t love that: it was kind of slow and it felt silly. It shouldn’t be that hard to compile one Go program!
Nix to the rescue! Here’s what I did to install the old version of Hugo with nix.
I wanted to install Hugo 0.40 and put it in my PATH as hugo-0.40
. Here’s how
I did it. I did this in a kind of weird way, but it worked (Searching and installing old versions of Nix packages
describes a probably more normal method).
step 1: Search through the nixpkgs repo to find Hugo 0.40
I found the .nix
file here github.com/NixOS/nixpkgs/blob/17b2ef2/p…
step 2: Download that file and build it
I downloaded that file (and another file called deps.nix in the same directory), replaced the first line with with import <nixpkgs> {};, and built it with nix-build hugo.nix.
That almost worked without any changes, but I had to make two changes:
- change with stdenv.lib to with lib, for some reason
- rename the package to hugo040 so that it wouldn't conflict with the other version of hugo that I had installed
step 3: Rename hugo to hugo-0.40
I wrote a little postInstall script to rename the Hugo binary.
postInstall = ''
mv $out/bin/hugo $out/bin/hugo-0.40
'';
I figured out how to run this by running rg 'mv '
in the nixpkgs repository and just copying and modifying something that seemed related.
step 4: Install it
I installed it into my ~/.nix-profile/bin by running nix-env -i -f hugo.nix.
And it all works! I put the final .nix
file into my own personal nixpkgs repo so that I can use it again later if I
want.
I think it’s worth noting here that this hugo.nix
file isn’t magic – the
reason I can easily compile Hugo 0.40 today is that many people worked for a long time to make it possible to
package that version of Hugo in a reproducible way.
Installing paperjam
and this 5-year-old version of Hugo were both
surprisingly painless and actually much easier than compiling it without nix,
because nix made it much easier for me to compile the paperjam
package with
the right version of libiconv
, and because someone 5 years ago had already
gone to the trouble of listing out the exact dependencies for Hugo.
I don’t have any plans to get much more complicated with nix (and it’s still very possible I’ll get frustrated with it and go back to homebrew!), but we’ll see what happens! I’ve found it much easier to start in a simple way and then start using more features if I feel the need instead of adopting a whole bunch of complicated stuff all at once.
I probably won’t use nix on Linux – I’ve always been happy enough with apt
(on Debian-based distros) and pacman
(on Arch-based distros), and they’re
much less confusing. But on a Mac it seems like it might be worth it. We’ll
see! It’s very possible in 3 months I’ll get frustrated with nix and just go back to homebrew.
Update from 5 months in: nix is still going well, and I’ve only run into 1
problem, which is that every nix-env -iA
package installation started failing
with the error “bad meta.outputsToInstall”.
This script from Ross Light fixes that problem though. It lists every derivation installed in my current profile and creates a new profile with the exact same derivations. This feels like a nix bug (surely creating a new profile with the exact same derivations should be a no-op?) but I haven’t looked into it more yet.
OnePlus is finally ready to detail its first mechanical keyboard. No, we didn't need another company to start making mechanical keyboards. But if you're looking for a new Bluetooth keyboard that plays particularly well with Macs, has a compact layout, and a rotary knob that looks stylish and functional, OnePlus will have one more choice for you come April.
Announced today, OnePlus is jumping into the mechanical keyboard race with a strange name, the Featuring Keyboard 81 Pro. The "81" refers to the key count, while "Pro" is presumably meant to make workers and power users think the keyboard's a good fit; but the name doesn't quite roll off the tongue. The outlier here is the "Featuring" bit, which refers to the OnePlus Featuring "co-creation" platform that builds products based on user feedback. Community users are said to have contributed to the 81 Pro's design, including its proprietary switches. OnePlus' press release today claimed it will release "many" more Featuring products.
Another huge influence on the 81 Pro is keyboard-maker Keychron, which is said to have helped engineer the product. That includes its layout, which matches the layout of the Q1 Pro that Keychron is currently crowdfunding. In addition to macOS, the keyboard is supposed to work with Windows, Linux, and Android, OnePlus' press release said. The keyboard's product page also claims support for iOS. Similar to some wireless Keychron keyboards, like the Keychron K14, there's a toggle on the keyboard's side for switching from Mac to Windows. Considering the lack of USB-A ports among Macs, the Bluetooth 5.1 keyboard charges over a USB-C to USB-C cable (there's also a USB-C to USB-A adapter).
Over the years, when Cloudflare has had an outage that affected our customers, we have very quickly blogged about what happened, why, and what we are doing to address the causes of the outage. Today's post is a little different. It's about a single customer's website not working correctly because of incorrect action taken by Cloudflare.
Although the customer was not in any way banned from Cloudflare, or lost access to their account, their website didn’t work. And it didn’t work because Cloudflare applied a bandwidth throttle between us and their origin server. The effect was that the website was unusable.
Because of this unusual throttle there was some internal confusion for our customer support team about what had happened. They, incorrectly, believed that the customer had been limited because of a breach of section 2.8 of our Self-Serve Subscription Agreement which prohibits use of our self-service CDN to serve excessive non-HTML content, such as images and video, without a paid plan that includes those services (this is, for example, designed to prevent someone building an image-hosting service on Cloudflare and consuming a huge amount of bandwidth; for that sort of use case we have paid image and video plans).
However, this customer wasn’t breaking section 2.8, and they were both a paying customer and a paying customer of Cloudflare Workers through which the throttled traffic was passing. This throttle should not have happened. In addition, there is and was no need for the customer to upgrade to some other plan level.
This incident has set off a number of workstreams inside Cloudflare to ensure better communication between teams, prevent such an incident happening, and to ensure that communications between Cloudflare and our customers are much clearer.
Before we explain our own mistake and how it came to be, we’d like to apologize to the customer. We realize the serious impact this had, and how we fell short of expectations. In this blog post, we want to explain what happened, and more importantly what we’re going to change to make sure it does not happen again.
On February 2, an on-call network engineer received an alert for a congesting interface with Equinix IX in our Ashburn data center. While this is not an unusual alert, this one stood out for two reasons. First, it was the second day in a row that it happened, and second, the congestion was due to a sudden and extreme spike of traffic.
The engineer in charge identified the customer's domain, tardis.dev, as being responsible for this sudden spike of traffic between Cloudflare and their origin network, a storage provider. Because this congestion happens on a physical interface connected to external peers, there was an immediate impact on many of our customers and peers. Port congestion like this typically causes packet loss, slow throughput, and higher than usual latency. While we have automatic mitigation in place for congesting interfaces, in this case the mitigation was unable to resolve the impact completely.
The traffic from this customer went suddenly from an average of 1,500 requests per second, and a 0.5 MB payload per request, to 3,000 requests per second (2x) and more than 12 MB payload per request (25x).
The congestion happened between Cloudflare and the origin network. Caching did not happen because the requests were all unique URLs going to the origin, and therefore we had no ability to serve from cache.
A Cloudflare engineer decided to apply a throttling mechanism to prevent the zone from pulling so much traffic from their origin. Let's be very clear on this action: Cloudflare does not have an established process to throttle customers that consume large amounts of bandwidth, and does not intend to have one. This remediation was a mistake, it was not sanctioned, and we deeply regret it.
We lifted the throttle through internal escalation 12 hours and 53 minutes after having set it up.
To make sure a similar incident does not happen, we are establishing clear rules to mitigate issues like this one. Any action taken against a customer domain, paying or not, will require multiple levels of approval and clear communication to the customer. Our tooling will be improved to reflect this. We have many ways of shaping traffic in situations where a huge spike affects a link, and we could have applied a different mitigation in this instance.
We are in the process of rewriting our terms of service to better reflect the type of services that our customers deliver on our platform today. We are also committed to explaining to our users in plain language what is permitted under self-service plans. As a developer-first company with transparency as one of its core principles, we know we can do better here. We will follow up with a blog post dedicated to these changes later.
Once again, we apologize to the customer for this action and for the confusion it created for other Cloudflare customers.
So you’ve played, or re-played, the epic fantasy saga that is The Witcher 3: Wild Hunt and its two critically acclaimed expansions, Hearts of Stone and Blood and Wine. Now you’re on the hunt for your next fantasy epic, but which to choose?
Millions of users rely on Cloudflare WARP to connect to the Internet through Cloudflare’s network. Individuals download the mobile or desktop application and rely on the Wireguard-based tunnel to make their browser faster and more private. Thousands of enterprises trust Cloudflare WARP to connect employees to our Secure Web Gateway and other Zero Trust services as they navigate the Internet.
We’ve heard from both groups of users that they also want to connect to other devices running WARP. Teams can build a private network on Cloudflare’s network today by connecting WARP on one side to a Cloudflare Tunnel, GRE tunnels, or IPSec tunnels on the other end. However, what if both devices already run WARP?
Starting today, we’re excited to make it even easier to build a network on Cloudflare with the launch of WARP-to-WARP connectivity. With a single click, any device running WARP in your organization can reach any other device running WARP. Developers can connect to a teammate's machine to test a web server. Administrators can reach employee devices to troubleshoot issues. The feature works with our existing private network on-ramps, like the tunnel options listed above. All with Zero Trust rules built in.
To get started, sign up to receive early access to our closed beta. If you’re interested in learning more about how it works and what else we will be launching in the future, keep scrolling.
We understand that adopting a Zero Trust architecture can feel overwhelming at times. With Cloudflare One, our mission is to make Zero Trust prescriptive and approachable regardless of where you are on your journey today. To help users navigate the uncertain, we created resources like our vendor-agnostic Zero Trust Roadmap which lays out a battle-tested path to Zero Trust. Within our own products and services, we’ve launched a number of features to bridge the gap between the networks you manage today and the network you hope to build for your organization in the future.
Ultimately, our goal is to enable you to overlay your network on Cloudflare however you want, whether that be with existing hardware in the field, a carrier you already partner with, through existing technology standards like IPsec tunnels, or more Zero Trust approaches like WARP or Tunnel. It shouldn’t matter which method you chose to start with, the point is that you need the flexibility to get started no matter where you are in this journey. We call these connectivity options on-ramps and off-ramps.
The model laid out above allows users to start by defining their specific needs and then customize their deployment by choosing from a set of fully composable on and offramps to connect their users and devices to Cloudflare. This means that customers are able to leverage any of these solutions together to route traffic seamlessly between devices, offices, data centers, cloud environments, and self-hosted or SaaS applications.
One example of a deployment we’ve seen thousands of customers be successful with is what we call WARP-to-Tunnel. In this deployment, the on-ramp Cloudflare WARP ensures end-user traffic reaches Cloudflare’s global network in a secure and performant manner. The off-ramp Cloudflare Tunnel then ensures that, after your Zero Trust rules have been enforced, we have secure, redundant, and reliable paths to land user traffic back in your distributed, private network.
This is a great example of a deployment that is ideal for users who need to support public-to-private traffic flows (i.e. North-South).
But what happens when you need to support private-to-private traffic flows (i.e. East-West) within this deployment?
Starting today, devices on-ramping to Cloudflare with WARP will also be able to off-ramp to each other. With this announcement, we’re adding yet another tool to leverage in new or existing deployments that provides users with stronger network fabric to connect users, devices, and autonomous systems.
This means any of your Zero Trust-enrolled devices will be able to securely connect to any other device on your Cloudflare-defined network, regardless of physical location or network configuration. This unlocks the ability for you to address any device running WARP in the exact same way you are able to send traffic to services behind a Cloudflare Tunnel today. Naturally, all of this traffic flows through our in-line Zero Trust services, regardless of how it gets to Cloudflare, and this new connectivity announced today is no exception.
To power all of this, we now track where WARP devices are connected to, in Cloudflare’s global network, the same way we do for Cloudflare Tunnel. Traffic meant for a specific WARP device is relayed across our network, using Argo Smart Routing, and piped through the transport that routes IP packets to the appropriate WARP device. Since this traffic goes through our Zero Trust Secure Web Gateway — allowing various types of filtering — it means we upgrade and downgrade traffic from purely routed IP packets to fully proxied TLS connections (as well as other protocols). In the case of using SSH to remotely access a colleague’s WARP device, this means that your traffic is eligible for SSH command auditing as well.
If you already deployed Cloudflare WARP to your organization, then your IT department will be excited to learn they can use this new connectivity to reach out to any device running Cloudflare WARP. Connecting via SSH, RDP, SMB, or any other service running on the device is now simpler than ever. All of this provides Zero Trust access for the IT team members, with their actions being secured in-line, audited, and pushed to your organization’s logs.
Or, maybe you are done with designing a new function of an existing product and want to let your team members check it out at their own convenience. Sending them a link with your private IP — assigned by Cloudflare — will do the job. Their devices will see your machine as if they were in the same physical network, despite being across the other side of the world.
The usefulness doesn't end with humans on both sides of the interaction: the weekend has arrived, and you have finally set out to move your local NAS to a host provider where you run a virtual machine. By running Cloudflare WARP on it, similarly to your laptop, you can now access your photos using the virtual machine's private IP. This was already possible with WARP to Tunnel; but with WARP-to-WARP, you also get connectivity in the reverse direction, where you can have the virtual machine periodically rsync/scp files from your laptop as well. This means you can make any server initiate traffic towards the rest of your Zero Trust organization with this new type of connectivity.
This feature will be available on all plans at no additional cost. To get started with this new feature, add your name to the closed beta, and we’ll notify you once you’ve been enrolled. Then, you’ll simply ensure that at least two devices are enrolled in Cloudflare Zero Trust and have the latest version of Cloudflare WARP installed.
This new feature builds upon the existing benefits of Cloudflare Zero Trust, which include enhanced connectivity, improved performance, and streamlined access controls. With the ability to connect to any other device in their deployment, Zero Trust users will be able to take advantage of even more robust security and connectivity options.
To get started in minutes, create a Zero Trust account, download the WARP agent, enroll these devices into your Zero Trust organization, and start creating Zero Trust policies to establish fast, secure connectivity between these devices. That’s it.
In the bright afternoon hours of New Year’s Day 2023, I squint through hungover eyes at my phone screen. My Twitter feed is blowing up about some movie I’ve never heard of before: Strange Days, a ‘90s Kathryn Bigelow sci-fi flick starring Ralph Fiennes, Angela Bassett, and Juliette Lewis.
Henry Cavill may be out as Geralt in the Netflix Witcher series, but we will always have Doug Cockle as Geralt in The Witcher 3: Wild Hunt. And now, thanks to the latest next-gen update to the Witcher 3, we can have the best of both worlds. The new update brings “In The Eternal Fire’s Shadow,” a Witcher-worthy side…
New General Purpose (M6in/M6idn) Instances
The original general purpose EC2 instance (m1.small) was launched in 2006 and was the one and only instance type for a little over a year, until we launched the m1.large and m1.xlarge in late 2007. After that, we added the m3 in 2012, m4 in 2015, and the first in a very long line of m5 instances starting in 2017. The family tree branched in 2018 with the addition of the m5d instances with local NVMe storage.
And that brings us to today, and to the new m6in and m6idn instances, both available in 9 sizes:
Name | vCPUs | Memory | Local Storage (m6idn only) | Network Bandwidth | EBS Bandwidth | EBS IOPS |
m6in.large / m6idn.large | 2 | 8 GiB | 118 GB | Up to 25 Gbps | Up to 20 Gbps | Up to 87,500 |
m6in.xlarge / m6idn.xlarge | 4 | 16 GiB | 237 GB | Up to 30 Gbps | Up to 20 Gbps | Up to 87,500 |
m6in.2xlarge / m6idn.2xlarge | 8 | 32 GiB | 474 GB | Up to 40 Gbps | Up to 20 Gbps | Up to 87,500 |
m6in.4xlarge / m6idn.4xlarge | 16 | 64 GiB | 950 GB | Up to 50 Gbps | Up to 20 Gbps | Up to 87,500 |
m6in.8xlarge / m6idn.8xlarge | 32 | 128 GiB | 1900 GB | 50 Gbps | 20 Gbps | 87,500 |
m6in.12xlarge / m6idn.12xlarge | 48 | 192 GiB | 2950 GB (2 x 1425) | 75 Gbps | 30 Gbps | 131,250 |
m6in.16xlarge / m6idn.16xlarge | 64 | 256 GiB | 3800 GB (2 x 1900) | 100 Gbps | 40 Gbps | 175,000 |
m6in.24xlarge / m6idn.24xlarge | 96 | 384 GiB | 5700 GB (4 x 1425) | 150 Gbps | 60 Gbps | 262,500 |
m6in.32xlarge / m6idn.32xlarge | 128 | 512 GiB | 7600 GB (4 x 1900) | 200 Gbps | 80 Gbps | 350,000 |
The m6in and m6idn instances are available in the US East (Ohio, N. Virginia) and Europe (Ireland) regions in On-Demand and Spot form. Savings Plans and Reserved Instances are available.
New C6in Instances
Back in 2008 we launched the first in what would prove to be a very long line of Amazon Elastic Compute Cloud (Amazon EC2) instances designed to give you high compute performance and a higher ratio of CPU power to memory than the general purpose instances. Starting with those initial c1 instances, we went on to launch cluster computing instances in 2010 (cc1) and 2011 (cc2), and then (once we got our naming figured out), multiple generations of compute-optimized instances powered by Intel processors: c3 (2013), c4 (2015), and c5 (2016). As our customers put these instances to use in environments where networking performance was starting to become a limiting factor, we introduced c5n instances with 100 Gbps networking in 2018. We also broadened the c5 instance lineup by adding additional sizes (including bare metal), and instances with blazing-fast local NVMe storage.
Today I am happy to announce the latest in our lineup of Intel-powered compute-optimized instances, the c6in, available in 9 sizes:
Name | vCPUs | Memory | Network Bandwidth | EBS Bandwidth | EBS IOPS |
c6in.large | 2 | 4 GiB | Up to 25 Gbps | Up to 20 Gbps | Up to 87,500 |
c6in.xlarge | 4 | 8 GiB | Up to 30 Gbps | Up to 20 Gbps | Up to 87,500 |
c6in.2xlarge | 8 | 16 GiB | Up to 40 Gbps | Up to 20 Gbps | Up to 87,500 |
c6in.4xlarge | 16 | 32 GiB | Up to 50 Gbps | Up to 20 Gbps | Up to 87,500 |
c6in.8xlarge | 32 | 64 GiB | 50 Gbps | 20 Gbps | 87,500 |
c6in.12xlarge | 48 | 96 GiB | 75 Gbps | 30 Gbps | 131,250 |
c6in.16xlarge | 64 | 128 GiB | 100 Gbps | 40 Gbps | 175,000 |
c6in.24xlarge | 96 | 192 GiB | 150 Gbps | 60 Gbps | 262,500 |
c6in.32xlarge | 128 | 256 GiB | 200 Gbps | 80 Gbps | 350,000 |
The c6in instances are available in the US East (Ohio, N. Virginia), US West (Oregon), and Europe (Ireland) Regions.
As I noted earlier, these instances are designed to be able to handle up to twice as many packets per second (PPS) as their predecessors. This allows them to deliver increased performance in situations where they need to handle a large number of small-ish network packets, which will accelerate many applications and use cases, including network virtual appliances (firewalls, virtual routers, load balancers, and appliances that detect and protect against DDoS attacks), telecommunications (Voice over IP (VoIP) and 5G communication), build servers, caches, in-memory databases, and gaming hosts. With more network bandwidth and PPS on tap, heavy-duty analytics applications that retrieve and store massive amounts of data and objects from Amazon Simple Storage Service (Amazon S3) or data lakes will benefit. For workloads that benefit from low-latency local storage, the disk versions of the new instances offer twice as much instance storage as the previous generation.
New Memory-Optimized (R6in/R6idn) Instances
The first memory-optimized instance was the m2, launched in 2009 with the now-quaint Double Extra Large and Quadruple Extra Large names, and a higher ratio of memory to CPU power than the earlier m1 instances. We had yet to learn our naming lesson and launched the High Memory Cluster Eight Extra Large (aka cr1.8xlarge) in 2013, before settling on the r prefix and launching r3 instances in 2013, followed by r4 instances in 2014, and r5 instances in 2018.
And again that brings us to today, and to the new r6in and r6idn instances, also available in 9 sizes:
Name | vCPUs | Memory | Local Storage (r6idn only) | Network Bandwidth | EBS Bandwidth | EBS IOPS |
r6in.large / r6idn.large | 2 | 16 GiB | 118 GB | Up to 25 Gbps | Up to 20 Gbps | Up to 87,500 |
r6in.xlarge / r6idn.xlarge | 4 | 32 GiB | 237 GB | Up to 30 Gbps | Up to 20 Gbps | Up to 87,500 |
r6in.2xlarge / r6idn.2xlarge | 8 | 64 GiB | 474 GB | Up to 40 Gbps | Up to 20 Gbps | Up to 87,500 |
r6in.4xlarge / r6idn.4xlarge | 16 | 128 GiB | 950 GB | Up to 50 Gbps | Up to 20 Gbps | Up to 87,500 |
r6in.8xlarge / r6idn.8xlarge | 32 | 256 GiB | 1900 GB | 50 Gbps | 20 Gbps | 87,500 |
r6in.12xlarge / r6idn.12xlarge | 48 | 384 GiB | 2950 GB (2 x 1425) | 75 Gbps | 30 Gbps | 131,250 |
r6in.16xlarge / r6idn.16xlarge | 64 | 512 GiB | 3800 GB (2 x 1900) | 100 Gbps | 40 Gbps | 175,000 |
r6in.24xlarge / r6idn.24xlarge | 96 | 768 GiB | 5700 GB (4 x 1425) | 150 Gbps | 60 Gbps | 262,500 |
r6in.32xlarge / r6idn.32xlarge | 128 | 1024 GiB | 7600 GB (4 x 1900) | 200 Gbps | 80 Gbps | 350,000 |
The r6in and r6idn instances are available in the US East (Ohio, N. Virginia), US West (Oregon), and Europe (Ireland) regions in On-Demand and Spot form. Savings Plans and Reserved Instances are available.
Inside the Instances
As you can probably guess from these specs and from the blog post that I wrote to launch the c6in instances, all of these new instance types have a lot in common. I’ll do a rare cut-and-paste from that post in order to reiterate all of the other cool features that are available to you:
Ice Lake Processors – The 3rd generation Intel Xeon Scalable processors run at 3.5 GHz, and (according to Intel) offer a 1.46x average performance gain over the prior generation. All-core Intel Turbo Boost mode is enabled on all instance sizes up to and including the 12xlarge. On the larger sizes, you can control the C-states. Intel Total Memory Encryption (TME) is enabled, protecting instance memory with a single, transient 128-bit key generated at boot time within the processor.
NUMA – Short for Non-Uniform Memory Access, this important architectural feature gives you the power to optimize for workloads where the majority of requests for a particular block of memory come from one of the processors, and that block is “closer” (architecturally speaking) to one of the processors. You can control processor affinity (and take advantage of NUMA) on the 24xlarge and 32xlarge instances.
Networking – Elastic Network Adapter (ENA) is available on all sizes of m6in, m6idn, c6in, r6in, and r6idn instances, and Elastic Fabric Adapter (EFA) is available on the 32xlarge instances. In order to make use of these adapters, you will need to make sure that your AMI includes the latest NVMe and ENA drivers. You can also make use of Cluster Placement Groups.
io2 Block Express – You can use all types of EBS volumes with these instances, including the io2 Block Express volumes that we launched earlier this year. As Channy shared in his post (Amazon EBS io2 Block Express Volumes with Amazon EC2 R5b Instances Are Now Generally Available), these volumes can be as large as 64 TiB, and can deliver up to 256,000 IOPS. As you can see from the tables above, you can use a 24xlarge or 32xlarge instance to achieve this level of performance.
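For a rough sense of what provisioning one of these volumes looks like with the AWS SDK for JavaScript v3, here is a hedged sketch; the Region, Availability Zone, size, and IOPS figures below are illustrative assumptions, not recommendations:

import { EC2Client, CreateVolumeCommand } from "@aws-sdk/client-ec2";

const ec2 = new EC2Client({ region: "us-east-1" });

// Create an io2 volume with provisioned IOPS. To actually drive the top of the
// range (up to 256,000 IOPS), attach it to a 24xlarge or 32xlarge instance from
// one of the families above.
const volume = await ec2.send(
  new CreateVolumeCommand({
    AvailabilityZone: "us-east-1a",
    Size: 4096, // GiB
    VolumeType: "io2",
    Iops: 64000,
  })
);

console.log(`Created volume ${volume.VolumeId}`);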
Choosing the Right Instance
Prior to today's launch, you could choose a c5n, m5n, or r5n instance to get the highest network bandwidth on an EC2 instance, or an r5b instance to have access to the highest EBS IOPS performance and high EBS bandwidth. Now, customers who need high networking or EBS performance can choose from a full portfolio of instances with different memory-to-vCPU ratios and instance storage options, by selecting one of the c6in, m6in, m6idn, r6in, or r6idn instances.
The higher performance of the c6in instances will allow you to scale your network-intensive workloads that need a low memory-to-vCPU ratio, such as network virtual appliances, caching servers, and gaming hosts.
The higher performance of m6in instances will allow you to scale your network and/or EBS intensive workloads such as data analytics, and telco applications including 5G User Plane Functions (UPF). You have the option to use the m6idn instance for workloads that benefit from low-latency local storage, such as high-performance file systems, or distributed web-scale in-memory caches.
Similarly, the higher network and EBS performance of the r6in instances will allow you to scale your network-intensive SQL, NoSQL, and in-memory database workloads, with the option to use the r6idn when you need low-latency local storage.
— Jeff;
As always, there’s simply too much for the team to cover and even if a launch doesn’t make this list, that doesn’t mean it’s not noteworthy. Make sure to check out What’s New for a complete rundown of all the AWS re:Invent 2022 announcements.
Here are a few more resources to help you keep up with all the re:Invent news:
Analytics | Business Applications | Artificial Intelligence / Machine Learning | Compute | Containers | Database | Management Tools | Migration & Transfer Services | Security, Identity, & Compliance | Storage |
New — Create and Share Operational Reports at Scale with Amazon QuickSight Paginated Reports
This feature allows customers to create and share highly formatted, personalized reports containing business-critical data to hundreds of thousands of end-users without any infrastructure setup or maintenance, up-front licensing, or long-term commitments.
New Amazon QuickSight API Capabilities to Accelerate Your BI Transformation
New QuickSight API capabilities allow programmatic creation and management of dashboards, analysis, and templates.
New AWS Glue 4.0 – New and Updated Engines, More Data Formats, and More
This version of Glue includes Python 3.10 and Apache Spark 3.3.0, plus native support for the Cloud Shuffle Service Plugin for Spark. It also includes Pandas support, and more.
Announcing AWS Glue for Ray (Preview)
Data engineers can use AWS Glue for Ray to process large datasets with Python and popular Python libraries.
New for Amazon Transcribe – Real-Time Analytics During Live Calls
Real-time call analytics provides APIs for developers to accurately transcribe live calls and at the same time identify customer experience issues and sentiment in real time.
Classifying and Extracting Mortgage Loan Data with Amazon Textract
The new API was created in response to requests from major lenders in the industry to help them process applications faster and reduce errors, which improves the end-customer experience and lowers operating costs.
Amazon CodeWhisperer Adds Enterprise Administrative Controls, Simple Sign-up, and Support for New Languages (Preview)
Administrators can now easily integrate CodeWhisperer with their existing workforce identity solutions, provide access to users and groups, and configure organization-wide settings.
AWS Wickr – A Secure, End-to-End Encrypted Communication Service For Enterprises With Auditing And Regulatory Requirements
Unlike many enterprise communication tools, Wickr uses end-to-end encryption mechanisms to ensure your messages, files, voice, or video calls are solely accessible to their intended recipients.
New – ENA Express: Improved Network Latency and Per-Flow Performance on EC2
Jeff Barr shares how ENA Express gives you a lot more per-flow bandwidth with a lot less variability.
New General Purpose, Compute Optimized, and Memory-Optimized Amazon EC2 Instances with Higher Packet-Processing Performance
The new instance families are designed to support your data-intensive workloads with the highest EBS performance in EC2, and the ability to handle up to twice as many packets per second (PPS) as earlier instances.
New Amazon EC2 Instance Types In the Works – C7gn, R7iz, and Hpc7g
Jeff Barr provides a look at three upcoming and exciting new instance types.
New – Amazon ECS Service Connect Enables Easy Communication Between Microservices
This new capability simplifies building and operating resilient distributed applications. You can add a layer of resilience to your ECS service communication and get traffic insights with no changes to your application code.
Announcing the availability of Microsoft Office Amazon Machine Images (AMIs) on Amazon EC2 with AWS provided licenses
With this offering, customers have the flexibility to run Microsoft Office dependent applications on EC2.
New – AWS Marketplace for Containers Now Supports Direct Deployment to Amazon EKS Clusters
This new launch makes it easier for you to find third-party Kubernetes operation software from the Amazon EKS console and deploy it to your EKS clusters using the same commands used to deploy EKS add-ons.
New – Amazon RDS Optimized Reads and Optimized Writes
These two new features will accelerate your Amazon RDS for MySQL workloads.
New – Fully Managed Blue/Green Deployments in Amazon Aurora and Amazon RDS
This new feature for Amazon Aurora with MySQL compatibility, Amazon RDS for MySQL, and Amazon RDS for MariaDB, enables you to make database updates safer, simpler, and faster.
New – AWS Config Rules Now Support Proactive Compliance
This release extends AWS Config rules to support proactive mode so that they can be run at any time before provisioning and save time spent to implement custom pre-deployment validations.
New for AWS Control Tower – Comprehensive Controls Management (Preview)
You can use the new capability to apply managed preventative, detective, and proactive controls to accounts and organizational units by service, control objective, or compliance framework.
Protect Sensitive Data with Amazon CloudWatch Logs
This new set of capabilities for Amazon CloudWatch Logs leverages pattern matching and machine learning (ML) to detect and protect sensitive log data in transit.
New – Amazon CloudWatch Cross-Account Observability
This new capability lets you search, analyze, and correlate cross-account telemetry data stored in CloudWatch such as metrics, logs, and traces.
New – A Fully Managed Schema Conversion in AWS Database Migration Service
AWS DMS Schema Conversion streamlines database migrations by making schema assessment and conversion available inside AWS DMS. You can now plan, assess, convert and migrate under one central DMS service.
AWS Application Migration Service Major Updates – New Migration Servers Grouping, Updated Launch, and Post-Launch Template
These three major updates will support your migration projects of any size.
Amazon Inspector Now Scans AWS Lambda Functions for Vulnerabilities
Until now, customers who wanted to analyze their mixed workloads (including EC2 instances, container images, and Lambda functions) against common vulnerabilities needed to use AWS and third-party tools.
Automated Data Discovery for Amazon Macie
This new capability allows you to gain visibility into where your sensitive data resides on Amazon Simple Storage Service (Amazon S3) at a fraction of the cost of running a full data inspection across all your S3 buckets.
AWS announces Amazon Verified Permissions (Preview)
This central fine-grained permissions management system simplifies changing and updating permission rules in a single place without needing to change the code.
New – Failover Controls for Amazon S3 Multi-Region Access Points
These controls let you shift S3 data access request traffic routed through an Amazon S3 Multi-Region Access Point to an alternate AWS Region within minutes to test and build highly available applications for business continuity.
New – Announcing Amazon EFS Elastic Throughput
This new throughput mode is designed to provide your applications with as much throughput as they need with pay-as-you-use pricing.
New for AWS Backup – Protect and Restore Your CloudFormation Stacks
You now have an automated solution to create and restore your applications with a simplified experience, eliminating the need to manage custom scripts.
New – Amazon Redshift Support in AWS Backup
AWS Backup allows you to define a central backup policy to manage data protection of your applications and can now also protect your Amazon Redshift clusters.
Announcing Automated in-AWS Failback for AWS Elastic Disaster Recovery
The new automated support provides a simplified and expedited experience to fail back Amazon Elastic Compute Cloud (Amazon EC2) instances to the original Region, and both failover and failback processes (for on-premises or in-AWS recovery) can be conveniently started from the AWS Management Console.
Local development gives you a fully-controllable and easy-to-debug testing environment. At the start of this year, we brought this experience to Workers developers by launching Miniflare 2.0: a local Cloudflare Workers simulator. Miniflare 2 came with features like step-through debugging support, detailed console.logs, pretty source-mapped error pages, live reload and a highly-configurable unit testing environment. Not only that, but we also incorporated Miniflare into Wrangler, our Workers CLI, to enable wrangler dev’s --local mode.
Today, we’re taking local development to the next level! In addition to introducing new support for migrating existing projects to your local development environment, we're making it easier to work with your remote data—locally! Most importantly, we're releasing a much more accurate Miniflare 3, powered by the recently open-sourced workerd runtime—the same runtime used by Cloudflare Workers!
One of the superpowers of having a local development environment is that you can test changes without affecting users in production. A great local environment offers a level of fidelity on par with production.
The way we originally approached local development was with Miniflare 2, which reimplemented Workers runtime APIs in JavaScript. Unfortunately, there were subtle behavior mismatches between these re-implementations and the real Workers runtime. These types of issues are really difficult for developers to debug, as they don’t appear locally, and step-through debugging of deployed Workers isn’t possible yet. For example, the following Worker returns responses successfully in Miniflare 2, so we might assume it’s safe to publish:
let cachedResponsePromise;
export default {
async fetch(request, env, ctx) {
// Let's imagine this fetch takes a few seconds. To speed up our worker, we
// decide to only fetch on the first request, and reuse the result later.
// This works fine in Miniflare 2, so we must be good right?
cachedResponsePromise ??= fetch("https://example.com");
return (await cachedResponsePromise).clone();
},
};
However, as soon as we send multiple requests to our deployed Worker, it fails with Error: Cannot perform I/O on behalf of a different request. The problem here is that response bodies created in one request’s handler cannot be accessed from a different request's handler. This limitation allows Cloudflare to improve overall Worker performance, but it was almost impossible for Miniflare 2 to detect these types of issues locally. In this particular case, the best solution is to cache using fetch itself.
Additionally, because the Workers runtime uses a very recent version of V8, it supports some JavaScript features that aren’t available in all versions of Node.js. This meant a few features implemented in Workers, like Array#findLast, weren’t always available in Miniflare 2.
With the Workers runtime now open-sourced, Miniflare 3 can leverage the same implementations that are deployed on Cloudflare’s network, giving bug-for-bug compatibility and practically eliminating behavior mismatches. 🎉
This radically simplifies our implementation too. We were able to remove over 50,000 lines of code from Miniflare 2. Of course, we still kept all the Miniflare special-sauce that makes development fun like live reload and detailed logging. 🙂
We know that many developers choose to test their Workers remotely on the Cloudflare network as it gives them the ability to test against real data. Testing against fake data in staging and local environments is sometimes difficult, as it never quite matches the real thing.
With Miniflare 3, we’re blurring the lines between local and remote development, by bringing real data to your machine as an experimental opt-in feature. If enabled, Miniflare will read and write data to namespaces on the Cloudflare network, as your Worker would when deployed. This is only supported with Workers KV for now, but we’re exploring similar solutions for R2 and D1.
With Miniflare 3 now effectively as accurate as the real Workers environment, and the ability to access real data locally, we’re revisiting the decision to make remote development the initial Wrangler experience. In a future update, wrangler dev --local will become the default; the --local flag will no longer be required. Benchmarking suggests this will bring an approximate 10x reduction in startup time and a massive 60x reduction in script reload times! Over the next few weeks, we’ll be focusing on further optimizing Wrangler’s performance to bring you the fastest Workers development experience yet!
wrangler init --from-dash
We want all developers to be able to take advantage of the improved local experience, so we’re making it easy to start a local Wrangler project from an existing Worker that’s been developed in the Cloudflare dashboard. With Node.js installed, run npx wrangler init --from-dash <your_worker_name> in your terminal to set up a new project with all your existing code and bindings, such as KV namespaces, configured. You can now seamlessly continue development of your application locally, taking advantage of all the developer experience improvements Wrangler and Miniflare provide. When you’re ready to deploy your worker, run npx wrangler publish.
Over the next few months, the Workers team is planning to further improve the local development experience with a specific focus on automated testing. Already, we’ve released a preliminary API for programmatic end-to-end tests with wrangler dev, but we’re also investigating ways of bringing Miniflare 2’s Jest/Vitest environments to workerd. We’re also considering creating extensions for popular IDEs to make developing workers even easier. 👀
Miniflare 3.0 is now included in Wrangler! Try it out by running npx wrangler@latest dev --experimental-local. Let us know what you think in the #wrangler channel on the Cloudflare Developers Discord, and please open a GitHub issue if you hit any unexpected behavior.
Today, we are announcing the general availability of OpenAPI Schemas for the Cloudflare API. These are published via GitHub and will be updated regularly as Cloudflare adds and updates APIs. OpenAPI is the widely adopted standard for defining APIs in a machine-readable format. OpenAPI Schemas allow for the ability to plug our API into a wide breadth of tooling to accelerate development for ourselves and customers. Internally, it will make it easier for us to maintain and update our APIs. Before getting into those benefits, let’s start with the basics.
Much of the Internet is built upon APIs (Application Programming Interfaces) or provides them as services to clients all around the world. This allows computers to talk to each other in a standardized fashion. OpenAPI is a widely adopted standard for how to define APIs. This allows other machines to reliably parse those definitions and use them in interesting ways. Cloudflare’s own API Shield product uses OpenAPI schemas to provide schema validation to ensure only well-formed API requests are sent to your origin.
Cloudflare itself has an API that customers can use to interface with our security and performance products from other places on the Internet. How do we define our own APIs? In the past we used a standard called JSON Hyper-Schema. That had served us well, but as time went on we wanted to adopt more tooling that could both benefit ourselves internally and make our customers’ lives easier. The OpenAPI community has flourished over the past few years, providing many capabilities, which we will discuss below, that were unavailable while we used JSON Hyper-Schema. As of today, we now use OpenAPI.
You can learn more about OpenAPI itself here. Having an open, well-understood standard for defining our APIs allows for shared tooling and infrastructure to be used that can read these standard definitions. Let’s take a look at a few examples.
Most customers won’t need to use the schemas themselves to see value. The first system leveraging OpenAPI schemas is our new API Docs that were announced today. Because we now have OpenAPI schemas, we leverage the open source tool Stoplight Elements to aid in generating this new doc site. This allowed us to retire our previously custom-built site that was hard to maintain. Additionally, many engineers at Cloudflare are familiar with OpenAPI, so teams can write new schemas more quickly and are less likely to make mistakes when defining new APIs, because they are working with a standard they already understand.
There are ways to leverage the schemas directly, however. The OpenAPI community has a huge number of tools that only require a set of schemas to be able to use. Two such examples are mocking APIs and library generation.
Say you have code that calls Cloudflare’s API and you want to be able to easily run unit tests locally or integration tests in your CI/CD pipeline. While you could just call Cloudflare’s API in each run, you may not want to for a few reasons. First, you may want to run tests frequently enough that managing the creation and tear down of resources becomes a pain. Also, in many of these tests you aren’t trying to validate logic in Cloudflare necessarily, but your own system’s behavior. In this case, mocking Cloudflare’s API would be ideal since you can gain confidence that you aren’t violating Cloudflare’s API contract, but without needing to worry about specifics of managing real resources. Additionally, mocking allows you to simulate different scenarios, like being rate limited or receiving 500 errors. This allows you to test your code for typically rare circumstances that can end up having a serious impact.
As an example, Stoplight Prism could be used to mock Cloudflare’s API for testing purposes. With a local copy of Cloudflare’s API Schemas you can run the following command to spin up a local mock server:
$ docker run --init --rm \
-v /home/user/git/api-schemas/openapi.yaml:/tmp/openapi.yaml \
-p 4010:4010 stoplight/prism:4 \
mock -h 0.0.0.0 /tmp/openapi.yaml
Then you can send requests to the mock server in order to validate that your use of Cloudflare’s API doesn’t violate the API contract locally:
$ curl -sX PUT localhost:4010/zones/f00/activation_check \
-Hx-auth-email:foo@bar.com -Hx-auth-key:foobarbaz | jq
{
"success": true,
"errors": [],
"messages": [],
"result": {
"id": "023e105f4ecef8ad9ca31a8372d0c353"
}
}
This means faster development and shorter test runs while still catching API contract issues early before they get merged or deployed.
Cloudflare maintains libraries for many tools and languages, like Terraform and Go, but we don’t support every possible programming language. Fortunately, using a tool like OpenAPI Generator, you can feed in Cloudflare’s API schemas and generate a client library in a wide range of languages to then use in your code to talk to Cloudflare’s API. For example, you could generate a Java library using the following commands:
git clone https://github.com/openapitools/openapi-generator
cd openapi-generator
mvn clean package
java -jar modules/openapi-generator-cli/target/openapi-generator-cli.jar generate \
-i https://raw.githubusercontent.com/cloudflare/api-schemas/main/openapi.yaml \
-g java \
-o /var/tmp/java_api_client
And then start using that client in your Java code to talk to Cloudflare’s API.
As mentioned earlier, we previously used JSON Hyper-Schema to define our APIs. We have roughly 600 endpoints that were already defined in the schemas. Here is a snippet of what one endpoint looks like in JSON Hyper-Schema:
{
"title": "List Zones",
"description": "List, search, sort, and filter your zones.",
"rel": "collection",
"href": "zones",
"method": "GET",
"schema": {
"$ref": "definitions/zone.json#/definitions/collection_query"
},
"targetSchema": {
"$ref": "#/definitions/response_collection"
},
"cfOwnership": "www",
"cfPlanAvailability": {
"free": true,
"pro": true,
"business": true,
"enterprise": true
},
"cfPermissionsRequired": {
"enum": [
"#zone:read"
]
}
}
Let’s look at the same endpoint in OpenAPI:
/zones:
get:
description: List, search, sort, and filter your zones.
operationId: zone-list-zones
responses:
4xx:
content:
application/json:
schema:
allOf:
- $ref: '#/components/schemas/components-schemas-response_collection'
- $ref: '#/components/schemas/api-response-common-failure'
description: List Zones response failure
"200":
content:
application/json:
schema:
$ref: '#/components/schemas/components-schemas-response_collection'
description: List Zones response
security:
- api_email: []
api_key: []
summary: List Zones
tags:
- Zone
x-cfPermissionsRequired:
enum:
- '#zone:read'
x-cfPlanAvailability:
business: true
enterprise: true
free: true
pro: true
You can see that the two look fairly similar, and for the most part the same information is contained in each, including the method type, a description, and request and response definitions (although those are linked via $refs). The value of migrating from one to the other isn’t the change in how we define the schemas themselves, but in what we can do with these schemas. Numerous tools can parse the latter, OpenAPI, while far fewer can parse the former, JSON Hyper-Schema.
If this one API was all that made up the Cloudflare API, it would be easy to just convert the JSON Hyper-Schema into the OpenAPI Schema by hand and call it a day. Doing this 600 times, however, was going to be a huge undertaking. When considering that teams are constantly adding new endpoints, it would be impossible to keep up. It was also the case that our existing API docs used the existing JSON Hyper-Schema, so that meant that we would need to keep both schemas up to date during any transition period. There had to be a better way.
Given that both JSON Hyper-Schema and OpenAPI are standards, it stands to reason that it should be possible to take a file in one format and convert it to the other, right? Luckily the answer is yes! We built a tool that took all existing JSON Hyper-Schemas and output fully compliant OpenAPI schemas. This of course didn’t happen overnight, but because of existing OpenAPI tooling, we could iteratively improve the auto-converter and run OpenAPI validation tooling over the output schemas to see what issues the conversion tool still had.
After many iterations and improvements to the conversion tool, we finally had fully compliant OpenAPI Spec schemas being auto-generated from our existing JSON Hyper-Schema. While we were building this tool, teams kept adding and updating the existing schemas and our Product Content team was also updating text in the schemas to make our API docs easier to use. The benefit of this process is we didn’t have to slow any of that work down since anything that changed in the old schemas was automatically reflected in the new schemas!
Once the tool was ready, the remaining step was to decide when and how we would stop making updates to the JSON Hyper-Schemas and move all teams to the OpenAPI Schemas. The (now old) API docs were the biggest concern, given they only understood JSON Hyper-Schema. Thanks to the help of our Developer Experience and Product Content teams, we were able to launch the new API docs today and can officially cut over to OpenAPI today as well!
Now that we have fully moved over to OpenAPI, more opportunities become available. Internally, we will be investigating what tooling we can adopt in order to help reduce the effort of individual teams and speed up API development. One idea we are exploring is automatically creating OpenAPI schemas from code annotations. Externally, we now have the foundational tools necessary to begin exploring how to auto-generate and support more programming language libraries for customers to use. We are also excited to see what you may do with the schemas yourself, so if you do something cool or have ideas, don’t hesitate to share them with us!
Duel Corp. looks so pretty that you’d be forgiven for thinking that it’s an upcoming RPG from Square Enix. But it’s actually made by some indie developers who decided to make a Soulslike from retro pixel art. And the effect is incredible.
gRPC is a modern open source high performance Remote Procedure Call (RPC) framework that can run in any environment. It plays a critical role in efficiently connecting microservices in and across data centers with pluggable support for load balancing, tracing, health checking, authentication and other cross-cutting features. It may also be applied in the last mile of distributed computing to connect devices, mobile applications and browsers to backend services hosted on the public cloud. This unique position in the software stack can provide a clear end-to-end view of the whole system. A new gRPC observability feature provides this clarity for workloads running on, and/or able to connect to, Google Cloud.
gRPC observability provides three different types of data:
1. Logs for key RPC events, including:
When the client/server sends or receives the metadata of an RPC
When the client/server sends or receives the message payload of an RPC
When the client/server finishes an RPC with a final status (OK, or errors)
2. Metrics (or statistical data) for key RPC events, including:
How many bytes the client/server sent or received
How many RPCs the client/server started or completed
How long RPCs take to complete between the client and server (known as round trip latency)
3. Distributed traces for RPCs and their fanout RPCs across the system. For example, when serving an RPC from upstream, a server may need to create multiple RPCs to its own backends. The distributed trace helps the user understand the relationships between these RPCs, the latency for each of them, and key events happening throughout the system.
When developers enable the gRPC observability feature in their binaries, the gRPC library will report the logging, metrics, and tracing data to Google Cloud’s operations suite. Once the observability data is collected, users can leverage the Google Cloud console to:
Visualize the observability data
Export the observability data out of the operations tools for further analysis with other tools.
Logging
gRPC observability provides logs for key RPC events with information to help developers understand the context when these events occur. This contextual information can include which gRPC service/method is being invoked, whether the events happen on the client side or server side, whether it’s sending metadata or payloads, the size of the corresponding data, and even the concrete content of the metadata and/or payloads. These log entries are then presented in Cloud Logging with helpers to filter and even customize the query to search related logs.
Metrics
gRPC observability provides several metrics: the round trip latency of RPCs, how many RPCs were started and finished during a specific period of time, and even the number of bytes sent/received over the wire. All these metrics can be grouped by a few important parameters, including service/method name and final status. Platform-specific metrics can be included as well, depending on the Google Cloud environment and the gRPC payload actually running. For example, on the Google Kubernetes Engine (GKE) platform, developers can group/filter by namespace, container, and pod information fields to dig into more granular statistical data. With these metrics, Cloud Monitoring enables users to identify problems including:
Which container is having higher than normal latency
Which pod is having higher than normal error rates
And others.
Tracing
gRPC observability also allows developers to configure the sampling rate of RPCs. The sampling decision is propagated across the whole system, thus no matter where the RPCs actually happen, developers can always see a complete, end-to-end distributed trace for their processing logic. Sampled RPCs and any further RPCs triggered by them are displayed in Cloud Trace as parent/children spans.
With gRPC observability, telemetry data (logs, metrics, traces) of gRPC workloads can be collected and reported to the Google Cloud operations suite. It helps developers get a better understanding of their systems and enables them to diagnose problems such as:
Which microservices have suddenly become abnormally slow (long processing latency on the server side)?
Which microservices suddenly process less QPS, and is there a pattern?
Is there a potential network issue for a particular microservice, where high latency is measured on the client side but normal latency on the server side? If so, can we locate the problem in a particular cluster, or even a particular node/pod?
To get started with gRPC observability, see our user guide.
Early on when we learn to program, we get introduced to the concept of recursion. And that it is handy for computing, among other things, sequences defined in terms of recurrences. Such as the famous Fibonacci numbers - F(n) = F(n-1) + F(n-2).
Later on, perhaps when diving into multithreaded programming, we come to terms with the fact that the stack space for call frames is finite. And that there is an “okay” way and a “cool” way to calculate the Fibonacci numbers using recursion:
// fib_okay.c
#include <stdint.h>
uint64_t fib(uint64_t n)
{
if (n == 0 || n == 1)
return 1;
return fib(n - 1) + fib(n - 2);
}
Listing 1. An okay Fibonacci number generator implementation
// fib_cool.c
#include <stdint.h>
static uint64_t fib_tail(uint64_t n, uint64_t a, uint64_t b)
{
if (n == 0)
return a;
if (n == 1)
return b;
return fib_tail(n - 1, b, a + b);
}
uint64_t fib(uint64_t n)
{
return fib_tail(n, 1, 1);
}
Listing 2. A better version of the same
If we take a look at the machine code the compiler produces, the “cool” variant translates to a nice and tight sequence of instructions:
⚠ DISCLAIMER: This blog post is assembly-heavy. We will be looking at assembly code for x86-64, arm64 and BPF architectures. If you need an introduction or a refresher, I can recommend “Low-Level Programming” by Igor Zhirkov for x86-64, and “Programming with 64-Bit ARM Assembly Language” by Stephen Smith for arm64. For BPF, see the Linux kernel documentation.
Listing 3. fib_cool.c compiled for x86-64 and arm64
The “okay” variant, disappointingly, leads to more instructions than a listing can fit. It is a spaghetti of basic blocks.
But more importantly, it is not free of x86 call instructions.
$ objdump -d fib_okay.o | grep call
10c: e8 00 00 00 00 call 111 <fib+0x111>
$ objdump -d fib_cool.o | grep call
$
This has an important consequence - as fib recursively calls itself, the stacks keep growing. We can observe it with a bit of help from the debugger.
$ gdb --quiet --batch --command=trace_rsp.gdb --args ./fib_okay 6
Breakpoint 1 at 0x401188: file fib_okay.c, line 3.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
n = 6, %rsp = 0xffffd920
n = 5, %rsp = 0xffffd900
n = 4, %rsp = 0xffffd8e0
n = 3, %rsp = 0xffffd8c0
n = 2, %rsp = 0xffffd8a0
n = 1, %rsp = 0xffffd880
n = 1, %rsp = 0xffffd8c0
n = 2, %rsp = 0xffffd8e0
n = 1, %rsp = 0xffffd8c0
n = 3, %rsp = 0xffffd900
n = 2, %rsp = 0xffffd8e0
n = 1, %rsp = 0xffffd8c0
n = 1, %rsp = 0xffffd900
13
[Inferior 1 (process 50904) exited normally]
$
While the “cool” variant makes no use of the stack.
$ gdb --quiet --batch --command=trace_rsp.gdb --args ./fib_cool 6
Breakpoint 1 at 0x40118a: file fib_cool.c, line 13.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
n = 6, %rsp = 0xffffd938
13
[Inferior 1 (process 50949) exited normally]
$
Where did the calls go?
The smart compiler turned the last function call in the body into a regular jump. Why was it allowed to do that?
It is the last instruction in the function body we are talking about. The caller stack frame is going to be destroyed right after we return anyway. So why keep it around when we can reuse it for the callee’s stack frame?
This optimization, known as tail call elimination, leaves us with no function calls in the “cool” variant of our fib implementation. There was only one call to eliminate - right at the end.
Once applied, the call becomes a jump (loop). If assembly is not your second language, decompiling the fib_cool.o object file with Ghidra helps see the transformation:
long fib(ulong param_1)
{
long lVar1;
long lVar2;
long lVar3;
if (param_1 < 2) {
lVar3 = 1;
}
else {
lVar3 = 1;
lVar2 = 1;
do {
lVar1 = lVar3;
param_1 = param_1 - 1;
lVar3 = lVar2 + lVar1;
lVar2 = lVar1;
} while (param_1 != 1);
}
return lVar3;
}
Listing 4. fib_cool.o decompiled by Ghidra
This is very much desired. Not only is the generated machine code much shorter. It is also way faster due to lack of calls, which pop up on the profile for fib_okay.
But I am no performance ninja and this blog post is not about compiler optimizations. So why am I telling you about it?
The concept of tail call elimination made its way into the BPF world. Although not in the way you might expect. Yes, the LLVM compiler does get rid of the trailing function calls when building for -target bpf. The transformation happens at the intermediate representation level, so it is backend agnostic. This can save you some BPF-to-BPF function calls, which you can spot by looking for call -N instructions in the BPF assembly.
However, when we talk about tail calls in the BPF context, we usually have something else in mind. And that is a mechanism, built into the BPF JIT compiler, for chaining BPF programs.
We first adopted BPF tail calls when building our XDP-based packet processing pipeline. Thanks to it, we were able to divide the processing logic into several XDP programs. Each responsible for doing one thing.
BPF tail calls have served us well since then. But they do have their caveats. Until recently it was impossible to have both BPF tail calls and BPF-to-BPF function calls in the same XDP program on arm64, which is one of the supported architectures for us.
Why? Before we get to that, we have to clarify what a BPF tail call actually does.
BPF exposes the tail call mechanism through the bpf_tail_call helper, which we can invoke from our BPF code. We don’t directly point out which BPF program we would like to call. Instead, we pass it a BPF map (a container) capable of holding references to BPF programs (BPF_MAP_TYPE_PROG_ARRAY), and an index into the map.
long bpf_tail_call(void *ctx, struct bpf_map *prog_array_map, u32 index)
Description
This special helper is used to trigger a "tail call", or
in other words, to jump into another eBPF program. The
same stack frame is used (but values on stack and in reg‐
isters for the caller are not accessible to the callee).
This mechanism allows for program chaining, either for
raising the maximum number of available eBPF instructions,
or to execute given programs in conditional blocks. For
security reasons, there is an upper limit to the number of
successive tail calls that can be performed.
At first glance, this looks somewhat similar to the execve(2) syscall. It is easy to mistake it for a way to execute a new program from the current program context. To quote the excellent BPF and XDP Reference Guide from the Cilium project documentation:
Tail calls can be seen as a mechanism that allows one BPF program to call another, without returning to the old program. Such a call has minimal overhead as unlike function calls, it is implemented as a long jump, reusing the same stack frame.
But once we add BPF function calls into the mix, it becomes clear that the BPF tail call mechanism is indeed an implementation of tail call elimination, rather than a way to replace one program with another:
Tail calls, before the actual jump to the target program, will unwind only its current stack frame. As we can see in the example above, if a tail call occurs from within the sub-function, the function’s (func1) stack frame will be present on the stack when a program execution is at func2. Once the final function (func3) function terminates, all the previous stack frames will be unwinded and control will get back to the caller of BPF program caller.
Alas, one with sometimes slightly surprising semantics. Consider code like the example below, where a BPF function calls the bpf_tail_call() helper:
struct {
__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
__uint(max_entries, 1);
__uint(key_size, sizeof(__u32));
__uint(value_size, sizeof(__u32));
} bar SEC(".maps");
SEC("tc")
int serve_drink(struct __sk_buff *skb __unused)
{
return 0xcafe;
}
static __noinline
int bring_order(struct __sk_buff *skb)
{
bpf_tail_call(skb, &bar, 0);
return 0xf00d;
}
SEC("tc")
int server1(struct __sk_buff *skb)
{
return bring_order(skb);
}
SEC("tc")
int server2(struct __sk_buff *skb)
{
__attribute__((musttail)) return bring_order(skb);
}
We have two seemingly not so different BPF programs - server1() and server2(). They both call the same BPF function bring_order(). The function tail calls into the serve_drink() program, if the bar[0] map entry points to it (let’s assume that).
Do both server1 and server2 return the same value? Turns out that - no, they don’t. We get a hex 🍔 from server1, and a ☕ from server2. How so?
First thing to notice is that a BPF tail call unwinds just the current function stack frame. Code past the bpf_tail_call() invocation in the function body never executes, providing the tail call is successful (the map entry was set, and the tail call limit has not been reached).
When the tail call finishes, control returns to the caller of the function which made the tail call. Applying this to our example, the control flow is serverX() --> bring_order() --> bpf_tail_call() --> serve_drink() -return-> serverX() for both programs.
The second thing to keep in mind is that the compiler does not know that the bpf_tail_call() helper changes the control flow. Hence, the unsuspecting compiler optimizes the code as if the execution would continue past the BPF tail call.
In our case, the compiler thinks it is okay to propagate the constant which bring_order() returns to server1(). Possibly catching us by surprise, if we didn’t check the generated BPF assembly.
We can prevent it by forcing the compiler to make a tail call to bring_order(). This way we ensure that whatever bring_order() returns will be used as the server2() program result.
🛈 General rule - for least surprising results, use the musttail attribute when calling a function that contains a BPF tail call.
How does the bpf_tail_call() helper work underneath then? And why won’t the BPF verifier let us mix function calls with tail calls on arm64? Time to dig deeper.
What does a bpf_tail_call() helper call translate to after the BPF JIT for x86-64 has compiled it? How does the implementation guarantee that we don’t end up in a tail call loop forever?
To find out we will need to piece together a few things.
First, there is the BPF JIT compiler source code, which lives in arch/x86/net/bpf_jit_comp.c. Its code is annotated with helpful comments. We will focus our attention on the following call chain within the JIT:
do_jit() 🔗
emit_prologue() 🔗
push_callee_regs() 🔗
for (i = 1; i <= insn_cnt; i++, insn++) {
switch (insn->code) {
case BPF_JMP | BPF_CALL:
/* emit function call */ 🔗
case BPF_JMP | BPF_TAIL_CALL:
emit_bpf_tail_call_direct() 🔗
case BPF_JMP | BPF_EXIT:
/* emit epilogue */ 🔗
}
}
It is sometimes hard to visualize the generated instruction stream just from reading the compiler code. Hence, we will also want to inspect the input - BPF instructions - and the output - x86-64 instructions - of the JIT compiler.
To inspect BPF and x86-64 instructions of a loaded BPF program, we can use bpftool prog dump. However, first we must populate the BPF map used as the tail call jump table. Otherwise, we might not be able to see the tail call jump! This is due to optimizations that use instruction patching when the index into the program array is known at load time.
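The source of tail_call_ex1.o is not reproduced in the post. For reference, here is a minimal sketch of what tail_call_ex1.bpf.c plausibly looks like, reconstructed from the program and map names in the commands and dumps below; the section names, map sizing and license line are assumptions:

// tail_call_ex1.bpf.c - a sketch reconstructed from the bpftool dumps below
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
    __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
    __uint(max_entries, 1);              /* assumed size */
    __uint(key_size, sizeof(__u32));
    __uint(value_size, sizeof(__u32));
} jmp_table SEC(".maps");

SEC("tc")                                /* assumed program type; it takes a __sk_buff */
int target_prog(struct __sk_buff *skb)
{
    return 0xcafe;                       /* matches the dump of target_prog further below */
}

SEC("tc")
int entry_prog(struct __sk_buff *skb)
{
    bpf_tail_call(skb, &jmp_table, 0);
    return 0xf00d;                       /* only reached if the tail call does not happen */
}

char __license[] SEC("license") = "GPL";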
# bpftool prog loadall ./tail_call_ex1.o /sys/fs/bpf pinmaps /sys/fs/bpf
# bpftool map update pinned /sys/fs/bpf/jmp_table key 0 0 0 0 value pinned /sys/fs/bpf/target_prog
# bpftool prog dump xlated pinned /sys/fs/bpf/entry_prog
int entry_prog(struct __sk_buff * skb):
; bpf_tail_call(skb, &jmp_table, 0);
0: (18) r2 = map[id:24]
2: (b7) r3 = 0
3: (85) call bpf_tail_call#12
; return 0xf00d;
4: (b7) r0 = 61453
5: (95) exit
# bpftool prog dump jited pinned /sys/fs/bpf/entry_prog
int entry_prog(struct __sk_buff * skb):
bpf_prog_4f697d723aa87765_entry_prog:
; bpf_tail_call(skb, &jmp_table, 0);
0: nopl 0x0(%rax,%rax,1)
5: xor %eax,%eax
7: push %rbp
8: mov %rsp,%rbp
b: push %rax
c: movabs $0xffff888102764800,%rsi
16: xor %edx,%edx
18: mov -0x4(%rbp),%eax
1e: cmp $0x21,%eax
21: jae 0x0000000000000037
23: add $0x1,%eax
26: mov %eax,-0x4(%rbp)
2c: nopl 0x0(%rax,%rax,1)
31: pop %rax
32: jmp 0xffffffffffffffe3 // bug? 🤔
; return 0xf00d;
37: mov $0xf00d,%eax
3c: leave
3d: ret
There is a caveat. The target addresses for tail call jumps in bpftool prog dump jited output will not make any sense. To discover the real jump targets, we have to peek into the kernel memory. That can be done with gdb after we find the address of our JIT’ed BPF programs in /proc/kallsyms:
# tail -2 /proc/kallsyms
ffffffffa0000720 t bpf_prog_f85b2547b00cbbe9_target_prog [bpf]
ffffffffa0000748 t bpf_prog_4f697d723aa87765_entry_prog [bpf]
# gdb -q -c /proc/kcore -ex 'x/18i 0xffffffffa0000748' -ex 'quit'
[New process 1]
Core was generated by `earlyprintk=serial,ttyS0,115200 console=ttyS0 psmouse.proto=exps "virtme_stty_c'.
#0 0x0000000000000000 in ?? ()
0xffffffffa0000748: nopl 0x0(%rax,%rax,1)
0xffffffffa000074d: xor %eax,%eax
0xffffffffa000074f: push %rbp
0xffffffffa0000750: mov %rsp,%rbp
0xffffffffa0000753: push %rax
0xffffffffa0000754: movabs $0xffff888102764800,%rsi
0xffffffffa000075e: xor %edx,%edx
0xffffffffa0000760: mov -0x4(%rbp),%eax
0xffffffffa0000766: cmp $0x21,%eax
0xffffffffa0000769: jae 0xffffffffa000077f
0xffffffffa000076b: add $0x1,%eax
0xffffffffa000076e: mov %eax,-0x4(%rbp)
0xffffffffa0000774: nopl 0x0(%rax,%rax,1)
0xffffffffa0000779: pop %rax
0xffffffffa000077a: jmp 0xffffffffa000072b
0xffffffffa000077f: mov $0xf00d,%eax
0xffffffffa0000784: leave
0xffffffffa0000785: ret
# gdb -q -c /proc/kcore -ex 'x/7i 0xffffffffa0000720' -ex 'quit'
[New process 1]
Core was generated by `earlyprintk=serial,ttyS0,115200 console=ttyS0 psmouse.proto=exps "virtme_stty_c'.
#0 0x0000000000000000 in ?? ()
0xffffffffa0000720: nopl 0x0(%rax,%rax,1)
0xffffffffa0000725: xchg %ax,%ax
0xffffffffa0000727: push %rbp
0xffffffffa0000728: mov %rsp,%rbp
0xffffffffa000072b: mov $0xcafe,%eax
0xffffffffa0000730: leave
0xffffffffa0000731: ret
#
Lastly, it will be handy to have a cheat sheet of the mapping from BPF registers (r0, r1, …) to the hardware registers (rax, rdi, …) that the JIT compiler uses.
BPF | x86-64 |
---|---|
r0 | rax |
r1 | rdi |
r2 | rsi |
r3 | rdx |
r4 | rcx |
r5 | r8 |
r6 | rbx |
r7 | r13 |
r8 | r14 |
r9 | r15 |
r10 | rbp |
internal | r9-r12 |
Now we are prepared to work out what happens when we use a BPF tail call.
In essence, bpf_tail_call() emits a jump into another function, reusing the current stack frame. It is just like a regular optimized tail call, but with a twist.
Because of the BPF security guarantees - execution terminates, no stack overflows - there is a limit on the number of tail calls we can have (MAX_TAIL_CALL_CNT = 33).
Counting the tail calls across BPF programs is not something we can do at load-time. The jump table (BPF program array) contents can change after the program has been verified. Our only option is to keep track of tail calls at run-time. That is why the JIT’ed code for the bpf_tail_call() helper checks and updates the tail_call_cnt counter.
The updated count is then passed from one BPF program to another, and from one BPF function to another, as we will see, through the rax register (r0 in BPF).
Luckily for us, the x86-64 calling convention dictates that the rax register does not partake in passing function arguments, but rather holds the function return value. The JIT can repurpose it to pass an additional - hidden - argument.
The function body is, however, free to make use of the r0/rax register in any way it pleases. This explains why we want to save the tail_call_cnt passed via rax onto stack right after we jump to another program. bpf_tail_call() can later load the value from a known location on the stack.
This way, the code emitted for each bpf_tail_call() invocation, and the BPF function prologue, work in tandem, keeping track of tail call count across BPF program boundaries.
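Putting those pieces together, the run-time logic emitted for each bpf_tail_call() site can be summarized in plain C. This is only a conceptual sketch - the types and names below are made up for illustration, it is not the kernel's code - but it mirrors the checks visible in the jited dump above and in the annotated arm64 listing later on:

#define MAX_TAIL_CALL_CNT 33

/* Illustrative stand-ins for the kernel's bpf_prog / bpf_array structures. */
struct fake_prog { int (*bpf_func)(void *ctx); };
struct fake_prog_array {
    unsigned int max_entries;
    struct fake_prog *ptrs[32];
};

static int fake_tail_call(void *ctx, struct fake_prog_array *array,
                          unsigned int index, unsigned int *tail_call_cnt)
{
    if (index >= array->max_entries)
        return -1;                        /* "goto out": run the code after bpf_tail_call() */
    if (*tail_call_cnt >= MAX_TAIL_CALL_CNT)
        return -1;
    (*tail_call_cnt)++;

    struct fake_prog *prog = array->ptrs[index];
    if (!prog)
        return -1;

    /* The real JIT unwinds the current BPF stack frame and jumps past the
     * target program's prologue; a plain C model can only express it as a call. */
    return prog->bpf_func(ctx);
}

The three early exits correspond to the bounds check, the tail call limit check, and the empty program array slot check that show up as conditional jumps in the JIT output.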
But what if our BPF program is split up into several BPF functions, each with its own stack frame? What if these functions perform BPF tail calls? How is the tail call count tracked then?
BPF has its own terminology when it comes to functions and calling them, which is influenced by the internal implementation. Function calls are referred to as BPF to BPF calls. Also, the main/entry function in your BPF code is called “the program”, while all other functions are known as “subprograms”.
Each call to a subprogram allocates a stack frame for local state, which persists until the function returns. Naturally, BPF subprogram calls can be nested, creating a call chain, just like nested function calls in user space.
BPF subprograms are also allowed to make BPF tail calls. This, effectively, is a mechanism for extending the call chain to another BPF program and its subprograms.
If we cannot track how long the call chain can be, and how much stack space each function uses, we put ourselves at risk of overflowing the stack. We cannot let this happen, so BPF enforces limitations on when and how many BPF tail calls can be done:
static int check_max_stack_depth(struct bpf_verifier_env *env)
{
…
/* protect against potential stack overflow that might happen when
* bpf2bpf calls get combined with tailcalls. Limit the caller's stack
* depth for such case down to 256 so that the worst case scenario
* would result in 8k stack size (32 which is tailcall limit * 256 =
* 8k).
*
* To get the idea what might happen, see an example:
* func1 -> sub rsp, 128
* subfunc1 -> sub rsp, 256
* tailcall1 -> add rsp, 256
* func2 -> sub rsp, 192 (total stack size = 128 + 192 = 320)
* subfunc2 -> sub rsp, 64
* subfunc22 -> sub rsp, 128
* tailcall2 -> add rsp, 128
* func3 -> sub rsp, 32 (total stack size 128 + 192 + 64 + 32 = 416)
*
* tailcall will unwind the current stack frame but it will not get rid
* of caller's stack as shown on the example above.
*/
if (idx && subprog[idx].has_tail_call && depth >= 256) {
verbose(env,
"tail_calls are not allowed when call stack of previous frames is %d bytes. Too large\n",
depth);
return -EACCES;
}
…
}
While the stack depth can be calculated by the BPF verifier at load-time, we still need to keep count of tail call jumps at run-time. Even when subprograms are involved.
This means that we have to pass the tail call count from one BPF subprogram to another, just like we did when making a BPF tail call, so we yet again turn to value passing through the rax register.
🛈 To keep things simple, BPF code in our examples does not allocate anything on stack. I encourage you to check how the JIT’ed code changes when you add some local variables. Just make sure the compiler does not optimize them out.
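For example (an assumed variation, not code from the post), giving a subprogram a volatile local buffer is a simple way to force some stack usage that the compiler will not optimize away:

/* Assumes the same includes and macros as the earlier examples. */
static __noinline int sub_func(struct __sk_buff *skb)
{
    volatile char scratch[16];    /* volatile: keep the stack slot from being optimized out */

    scratch[0] = 42;
    return 0xf00d + scratch[0];
}

With this in place, the sub sp, sp, #0x0 adjustments seen in the arm64 listings below become non-zero, which makes the stack handling easier to spot in the dumps.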
To make it work, we need to:
① load the tail call count saved on stack into rax before call’ing the subprogram,
② adjust the subprogram prologue, so that it does not reset the rax like the main program does,
③ save the passed tail call count on the subprogram’s stack for the bpf_tail_call() helper to consume it.
A bpf_tail_call() within our subprogram will then:
④ load the tail call count from the stack,
⑤ unwind the BPF stack, but keep the current subprogram’s stack frame intact, and
⑥ jump to the target BPF program.
Now we have seen how all the pieces of the puzzle fit together to make BPF tail calls work safely on x86-64. The only open question is: does it work the same way on other platforms, like arm64? Time to shift gears and dive into a completely different BPF JIT implementation.
If you try loading a BPF program that uses both BPF function calls (aka BPF to BPF calls) and BPF tail calls on an arm64 machine running the latest 5.15 LTS kernel, or even the latest 5.19 stable kernel, the BPF verifier will kindly ask you to reconsider your choice:
# uname -rm
5.19.12 aarch64
# bpftool prog loadall tail_call_ex2.o /sys/fs/bpf
libbpf: prog 'entry_prog': BPF program load failed: Invalid argument
libbpf: prog 'entry_prog': -- BEGIN PROG LOAD LOG --
0: R1=ctx(off=0,imm=0) R10=fp0
; __attribute__((musttail)) return sub_func(skb);
0: (85) call pc+1
caller:
R10=fp0
callee:
frame1: R1=ctx(off=0,imm=0) R10=fp0
; bpf_tail_call(skb, &jmp_table, 0);
2: (18) r2 = 0xffffff80c38c7200 ; frame1: R2_w=map_ptr(off=0,ks=4,vs=4,imm=0)
4: (b7) r3 = 0 ; frame1: R3_w=P0
5: (85) call bpf_tail_call#12
tail_calls are not allowed in non-JITed programs with bpf-to-bpf calls
processed 4 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
-- END PROG LOAD LOG --
…
#
That is a pity! We have been looking forward to reaping the benefits of code sharing with BPF to BPF calls in our lengthy machine generated BPF programs. So we asked - how hard could it be to make it work?
After all, BPF JIT for arm64 already can handle BPF tail calls and BPF to BPF calls, when used in isolation.
It is “just” a matter of understanding the existing JIT implementation, which lives in arch/arm64/net/bpf_jit_comp.c, and identifying the missing pieces.
To understand how BPF JIT for arm64 works, we will use the same method as before - look at its code together with sample input (BPF instructions) and output (arm64 instructions).
We don’t have to read the whole source code. It is enough to zero in on a few particular code paths:
bpf_int_jit_compile() 🔗
build_prologue() 🔗
build_body() 🔗
for (i = 0; i < prog->len; i++) {
build_insn() 🔗
switch (code) {
case BPF_JMP | BPF_CALL:
/* emit function call */ 🔗
case BPF_JMP | BPF_TAIL_CALL:
emit_bpf_tail_call() 🔗
}
}
build_epilogue() 🔗
One thing that the arm64 architecture, and RISC architectures in general, are known for is a plethora of general purpose registers (x0-x30). This is a good thing. We have more registers to allocate to JIT internal state, like the tail call count. A cheat sheet of what roles the hardware registers play in the BPF JIT will be helpful:
BPF | arm64 |
---|---|
r0 | x7 |
r1 | x0 |
r2 | x1 |
r3 | x2 |
r4 | x3 |
r5 | x4 |
r6 | x19 |
r7 | x20 |
r8 | x21 |
r9 | x22 |
r10 | x25 |
internal | x9-x12, x26 (tail_call_cnt), x27 |
Now let’s try to understand the state of things by looking at the JIT’s input and output for two particular scenarios: (1) a BPF tail call, and (2) a BPF to BPF call.
It is hard to read assembly code selectively. We will have to go through all instructions one by one, and understand what each one is doing.
⚠ Brace yourself. Time to decipher a bit of ARM64 assembly. If this will be your first time reading ARM64 assembly, you might want to at least skim through this Guide to ARM64 / AArch64 Assembly on Linux before diving in.
Scenario #1: A single BPF tail call - tail_call_ex1.bpf.c
Input: BPF assembly (bpftool prog dump xlated)
0: (18) r2 = map[id:4] // jmp_table map
2: (b7) r3 = 0
3: (85) call bpf_tail_call#12
4: (b7) r0 = 61453 // 0xf00d
5: (95) exit
Output: ARM64 assembly (bpftool prog dump jited)
0: paciasp // Sign LR (ROP protection) ①
4: stp x29, x30, [sp, #-16]! // Save FP and LR registers ②
8: mov x29, sp // Set up Frame Pointer
c: stp x19, x20, [sp, #-16]! // Save callee-saved registers ③
10: stp x21, x22, [sp, #-16]! // ⋮
14: stp x25, x26, [sp, #-16]! // ⋮
18: stp x27, x28, [sp, #-16]! // ⋮
1c: mov x25, sp // Set up BPF stack base register (r10)
20: mov x26, #0x0 // Initialize tail_call_cnt ④
24: sub x27, x25, #0x0 // Calculate FP bottom ⑤
28: sub sp, sp, #0x200 // Set up BPF program stack ⑥
2c: mov x1, #0xffffff80ffffffff // r2 = map[id:4] ⑦
30: movk x1, #0xc38c, lsl #16 // ⋮
34: movk x1, #0x7200 // ⋮
38: mov x2, #0x0 // r3 = 0
3c: mov w10, #0x24 // = offsetof(struct bpf_array, map.max_entries) ⑧
40: ldr w10, [x1, x10] // Load array->map.max_entries
44: add w2, w2, #0x0 // = index (0)
48: cmp w2, w10 // if (index >= array->map.max_entries)
4c: b.cs 0x0000000000000088 // goto out;
50: mov w10, #0x21 // = MAX_TAIL_CALL_CNT (33)
54: cmp x26, x10 // if (tail_call_cnt >= MAX_TAIL_CALL_CNT)
58: b.cs 0x0000000000000088 // goto out;
5c: add x26, x26, #0x1 // tail_call_cnt++;
60: mov w10, #0x110 // = offsetof(struct bpf_array, ptrs)
64: add x10, x1, x10 // = &array->ptrs
68: lsl x11, x2, #3 // = index * sizeof(array->ptrs[0])
6c: ldr x11, [x10, x11] // prog = array->ptrs[index];
70: cbz x11, 0x0000000000000088 // if (prog == NULL) goto out;
74: mov w10, #0x30 // = offsetof(struct bpf_prog, bpf_func)
78: ldr x10, [x11, x10] // Load prog->bpf_func
7c: add x10, x10, #0x24 // += PROLOGUE_OFFSET * AARCH64_INSN_SIZE (4)
80: add sp, sp, #0x200 // Unwind BPF stack
84: br x10 // goto *(prog->bpf_func + prologue_offset)
88: mov x7, #0xf00d // r0 = 0xf00d
8c: add sp, sp, #0x200 // Unwind BPF stack ⑨
90: ldp x27, x28, [sp], #16 // Restore used callee-saved registers
94: ldp x25, x26, [sp], #16 // ⋮
98: ldp x21, x22, [sp], #16 // ⋮
9c: ldp x19, x20, [sp], #16 // ⋮
a0: ldp x29, x30, [sp], #16 // ⋮
a4: add x0, x7, #0x0 // Set return value
a8: autiasp // Authenticate LR
ac: ret // Return to caller
① BPF program prologue starts with Pointer Authentication Code (PAC), which protects against Return Oriented Programming attacks. PAC instructions are emitted by JIT only if CONFIG_ARM64_PTR_AUTH_KERNEL is enabled.
② Arm 64 Architecture Procedure Call Standard mandates that the Frame Pointer (register X29) and the Link Register (register X30), aka the return address, of the caller should be recorded onto the stack.
③ Registers X19 to X28, and X29 (FP) plus X30 (LR), are callee saved. ARM64 BPF JIT does not use registers X23 and X24 currently, so they are not saved.
④ We track the tail call depth in X26. No need to save it onto stack since we use a register dedicated just for this purpose.
⑤ FP bottom is an optimization that allows store/loads to BPF stack with a single instruction and an immediate offset value.
⑥ Reserve space for the BPF program stack. The stack layout is now as shown in a diagram in the build_prologue() source code.
⑦ The BPF function body starts here.
⑧ bpf_tail_call() instructions start here.
⑨ The epilogue starts here.
Whew! That was a handful 😅.
Notice that the BPF tail call implementation on arm64 is not as optimized as on x86-64. There is no code patching to make direct jumps when the target program index is known at the JIT-compilation time. Instead, the target address is always loaded from the BPF program array.
Ready for the second scenario? I promise it will be shorter. Function prologue and epilogue instructions will look familiar, so we are going to keep annotations down to a minimum.
Scenario #2: A BPF to BPF call - sub_call_ex1.bpf.c
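The source for sub_call_ex1.bpf.c is likewise not included in the post; judging from the xlated dump below, it is presumably close to this sketch (section name, includes and license are assumptions):

// sub_call_ex1.bpf.c - a sketch reconstructed from the xlated dump below
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

static __noinline int sub_func(struct __sk_buff *skb)
{
    return 0xf00d;
}

SEC("tc")                                /* assumed program type */
int entry_prog(struct __sk_buff *skb)
{
    return sub_func(skb);
}

char __license[] SEC("license") = "GPL";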
Input: BPF assembly (bpftool prog dump xlated)
int entry_prog(struct __sk_buff * skb):
0: (85) call pc+1#bpf_prog_a84919ecd878b8f3_sub_func
1: (95) exit
int sub_func(struct __sk_buff * skb):
2: (b7) r0 = 61453 // 0xf00d
3: (95) exit
Output: ARM64 assembly
int entry_prog(struct __sk_buff * skb):
bpf_prog_163e74e7188910f2_entry_prog:
0: paciasp // Begin prologue
4: stp x29, x30, [sp, #-16]! // ⋮
8: mov x29, sp // ⋮
c: stp x19, x20, [sp, #-16]! // ⋮
10: stp x21, x22, [sp, #-16]! // ⋮
14: stp x25, x26, [sp, #-16]! // ⋮
18: stp x27, x28, [sp, #-16]! // ⋮
1c: mov x25, sp // ⋮
20: mov x26, #0x0 // ⋮
24: sub x27, x25, #0x0 // ⋮
28: sub sp, sp, #0x0 // End prologue
2c: mov x10, #0xffffffffffff5420 // Build sub_func()+0x0 address
30: movk x10, #0x8ff, lsl #16 // ⋮
34: movk x10, #0xffc0, lsl #32 // ⋮
38: blr x10 ------------------. // Call sub_func()+0x0
3c: add x7, x0, #0x0 <----------. // r0 = sub_func()
40: mov sp, sp | | // Begin epilogue
44: ldp x27, x28, [sp], #16 | | // ⋮
48: ldp x25, x26, [sp], #16 | | // ⋮
4c: ldp x21, x22, [sp], #16 | | // ⋮
50: ldp x19, x20, [sp], #16 | | // ⋮
54: ldp x29, x30, [sp], #16 | | // ⋮
58: add x0, x7, #0x0 | | // ⋮
5c: autiasp | | // ⋮
60: ret | | // End epilogue
| |
int sub_func(struct __sk_buff * skb): | |
bpf_prog_a84919ecd878b8f3_sub_func: | |
0: paciasp <---------------------' | // Begin prologue
4: stp x29, x30, [sp, #-16]! | // ⋮
8: mov x29, sp | // ⋮
c: stp x19, x20, [sp, #-16]! | // ⋮
10: stp x21, x22, [sp, #-16]! | // ⋮
14: stp x25, x26, [sp, #-16]! | // ⋮
18: stp x27, x28, [sp, #-16]! | // ⋮
1c: mov x25, sp | // ⋮
20: mov x26, #0x0 | // ⋮
24: sub x27, x25, #0x0 | // ⋮
28: sub sp, sp, #0x0 | // End prologue
2c: mov x7, #0xf00d | // r0 = 0xf00d
30: mov sp, sp | // Begin epilogue
34: ldp x27, x28, [sp], #16 | // ⋮
38: ldp x25, x26, [sp], #16 | // ⋮
3c: ldp x21, x22, [sp], #16 | // ⋮
40: ldp x19, x20, [sp], #16 | // ⋮
44: ldp x29, x30, [sp], #16 | // ⋮
48: add x0, x7, #0x0 | // ⋮
4c: autiasp | // ⋮
50: ret ----------------------------' // End epilogue
We have now seen what a BPF tail call and a BPF function/subprogram call compiles down to. Can you already spot what would go wrong if mixing the two was allowed?
That’s right! Every time we enter a BPF subprogram, we reset the X26 register, which holds the tail call count, to zero (mov x26, #0x0). This is bad. It would let users create program chains longer than the MAX_TAIL_CALL_CNT limit.
How about we just skip this step when emitting the prologue for BPF subprograms?
@@ -246,6 +246,7 @@ static bool is_lsi_offset(int offset, int scale)
static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf)
{
const struct bpf_prog *prog = ctx->prog;
+ const bool is_main_prog = prog->aux->func_idx == 0;
const u8 r6 = bpf2a64[BPF_REG_6];
const u8 r7 = bpf2a64[BPF_REG_7];
const u8 r8 = bpf2a64[BPF_REG_8];
@@ -299,7 +300,7 @@ static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf)
/* Set up BPF prog stack base register */
emit(A64_MOV(1, fp, A64_SP), ctx);
- if (!ebpf_from_cbpf) {
+ if (!ebpf_from_cbpf && is_main_prog) {
/* Initialize tail_call_cnt */
emit(A64_MOVZ(1, tcc, 0, 0), ctx);
Believe it or not. This is everything that was missing to get BPF tail calls working with function calls on arm64. The feature will be enabled in the upcoming Linux 6.0 release.
From recursion to tweaking the BPF JIT. How did we get here? Not important. It’s all about the journey.
Along the way we have unveiled a few secrets behind BPF tail calls, and hopefully quenched your thirst for low-level programming. At least for today.
All that is left is to sit back and watch the fruits of our work. With GDB hooked up to a VM, we can observe how a BPF program calls into a BPF function, and from there tail calls to another BPF program:
demo-gdb-step-thru-bpf.pages.dev/
Until next time 🖖.
From our earliest days, Cloudflare has stood for helping build a better Internet that’s accessible to all. It’s core to our mission that anyone who wants to start building on the Internet should be able to do so easily, and without the barriers of prohibitively expensive or difficult to use infrastructure.
Nowhere is this philosophy more important – and more impactful to the Internet – than with our developer platform, Cloudflare Workers. Workers is, quite simply, where developers and entrepreneurs start on Day 1. It’s a full developer platform that includes cloud storage; website hosting; SQL databases; and of course, the industry’s leading serverless product. The platform’s ease-of-use and accessible pricing (all the way down to free) are critical in advancing our mission. For startups, this translates into fast, easy deployment and iteration, that scales seamlessly with predictable, transparent and cost-effective pricing. Building a great business from scratch is hard enough – we ought to know! – and so we’re aiming to take all the complexity out of your application infrastructure.
Today, we’re taking things a step further and making it easier for startups to build the business of their dreams. We’re announcing a $1.25 billion Workers Launchpad funding program in partnership with some of the world’s leading venture capital firms. Any startup built on Workers can apply. As is the case with the Workers Platform itself, we’ve tried to make applying dead simple: it should take you less than five minutes to submit your application through the Workers Launchpad portal.
How does it work? The only requirement for being eligible for the funding program is that you’ve built your core infrastructure on Workers. If you’re new to Cloudflare and Cloudflare Workers, check out our Startup Plan to get started. We hope these resources will be helpful to all startups and help level the playing field, no matter where in the world you might be.
Once you submit your application, it will be reviewed by our Launchpad team, several of whom are former entrepreneurs and venture capital folks themselves. They’ll match promising applicants with our VC partners who have the most expertise in your space (more on them below). Every quarter, we’ll announce the winners of our Launchpad program. Winners, our “Workers Founders”, will be guaranteed the opportunity to pitch the VC partner(s) that we’ve determined would be a good match for your business. It’s a win-win all around. VCs get the opportunity to invest in businesses they know are being built on a forward-looking, world-class, development platform. Entrepreneurs get connected to world-class VCs. And for the first class of winners, we’ll have a few added perks that we describe in more detail below.
When we approached our friends in the venture community with our vision for the Workers Launchpad, we received incredibly positive feedback and excitement. Many have seen firsthand the competitive advantages of building on Workers through their own portfolio companies. Moreover, Cloudflare is home to one of the largest developer communities on Earth with approximately 20% of the world’s websites on our network. As such, we can play a unique role in matching great entrepreneurs with great VCs to further not only the Workers platform, but also the Internet ecosystem, for everyone.
We’re honored to announce a world-class group of VC Launch Partners supporting this program and the ecosystem of Workers-based startups:
So why are we doing this? The simple answer is we’re proud of our Workers Developer Platform and think that everyone should be using it. Entrepreneurs who develop on Workers can ship faster, more easily, and more cost-effectively, and in a way that future-proofs their infrastructure:
Speed. Development velocity isn’t just a convenience for an entrepreneur. It’s a massive competitive advantage. In fact, development velocity is one of Cloudflare’s competitive advantages – we’re able to develop quickly because we build on Workers. When you develop on Workers, you don’t need to spend time configuring DNS records, maintaining certificates, scaling up clusters, or building complex deployment pipelines. Focus on developing your application, and Cloudflare will handle the rest.
Ease of use. Startup teams and founders are some of the busiest people on earth. You shouldn’t have to think about – or make complicated decisions about – IT infrastructure. Questions like: “Which availability zone should I choose?”, or “Will I be able to scale up my infrastructure in time for our next viral marketing campaign?” shouldn’t have to cross your mind! And on the Workers Platform, they don’t. The code you and your team write automatically deploys quickly and consistently across Cloudflare’s global network in 275+ cities in over 100 countries. Cloudflare securely and scalably connects your users to your applications, regardless of where those applications are hosted or how many users suddenly sign up for your product. Developers can easily manage globally distributed applications with a programmable network that easily connects to whatever services they need to talk to.
Future-proofing your infrastructure and your wallet. Cloudflare’s massive global network – that’s distributed across 275+ cities in over 100 countries – is able to scale with your business, no matter how large it grows to become. We also help you remain compliant with local laws and regulations as you expand around the world, with capabilities like Workers’ Jurisdictional Restrictions for Durable Objects. You can sleep soundly at night instead of worrying about how to level up your infrastructure in the midst of shifting regulations, and equally importantly, knowing that you will not wake up to any surprise bills. Many of us have had the experience of being charged unexpected and / or exorbitant fees from our cloud providers. For example, providers will often make it easy and free to onboard your application or data, but charge exorbitant rates when you want to move them out (i.e. egress fees). Cloudflare will never charge for egress. Our pricing is simple, and we constantly aim to be the low-cost provider, no matter how large your business grows to be.
We’re excited about Workers not only because we’ve built our own infrastructure on it, but also because we’re seeing the incredible things others have built on it. In fact, we acquired a company built entirely on Workers at the end of last year, an Israeli start-up named Zaraz, which secures and accelerates third party web tools. Workers allowed Zaraz to replace the multiple network requests of each tool running on a website with one single request, effectively streamlining a messy web of extensions into a single lightweight application. This acquisition opened our eyes to the power of the global community that’s built on our platform, and left us motivated to help startups built on Workers find the funding, mentorship, and support needed to grow.
To make it even easier for startups to take advantage of all the benefits that Workers has to offer, applicants to the Workers Launchpad program who have raised less than $3 million in total external funding will automatically have the option to receive Cloudflare’s Startup Plan. This plan includes all the elements of Cloudflare’s Pro and Business Plans ($2,400 annual value) plus higher tiers of our Stream video product, our Teams Zero Trust security suite and the Workers platform. To make sure the full range of our developer platform is accessible to startups, we recently more than tripled the number of products available in this plan, which now includes email security, R2, Pages, KV, and many others.
Furthermore, all startups that apply by October 31, 2022, will be eligible to be selected for the Winter 2022 class of Workers Founders, which will unlock additional support, mentorship, and marketing opportunities. Being selected as a Workers Founder will get you a chance to practice your pitch with investors, engage with leaders from Cloudflare, and get advice on how to build a successful business, on topics ranging from recruiting to marketing, sales, and beyond, during a virtual Workers Founders Bootcamp Week. The program will culminate in a virtual Demo Day, so you can show the world what you’ve been building. We’re leaning in to help promising entrepreneurs join us in our mission to help build a better Internet.
Accessibility and ease of use are core to everything we do at Cloudflare. We will always make our products and platforms so easy to use that even the smallest business or hobbyist can easily use them. We hope the Workers Launchpad funding program encourages entrepreneurs from all around the world, and from all backgrounds, to start building on Workers, and makes it easier for you to find the funding you need to build the business of your dreams.
Head to the Workers Launchpad page to apply and join the Cloudflare Developer Discord to engage with the Workers community. If you’re a VC that is interested in supporting the program, reach out to Workers-Launchpad@cloudflare.com.
Cloudflare is not providing any funding or making any funding decisions, and there is no guarantee that any particular company will receive funding through the program. All funding decisions will be made by the venture capital firms that participate in the program. Cloudflare is not a registered broker-dealer, investment adviser, or other similar intermediary.
One way to get started is to raise seed funding.
Startups looking for funding rarely have positive net cash flow, and many only have a few founders and a great plan. Only 2 in 5 startups are profitable at any point. And most startups struggle to find profits early. According to some sources, it takes around 2 to 3 years for the average startup to start generating profits. Many founders do not have the personal wealth to bootstrap their startups for years.
Because traditional funding levers (e.g., private equity firms, investment banks, hedge funds, etc.) aren't willing to fund unproven businesses, startups often look for a special type of funding called "seed" funding. As the name suggests, investors provide a seed (funding) in the hopes that the startup nurtures that seed money into a healthy tree (a successful startup). In return, the startup provides equity (usually between 5 to 20 percent) or convertible debt.
So, where do you find seed funding?
Generally, seed funding comes from one of the following sources:

- friends and family
- angel investors
- incubators and accelerators
- venture capital firms
Each of these funding levers provides unique benefits. Friends and family are often the easiest way to get quick funding, but at the risk of damaged relationships. Angels, incubators, and accelerators all provide a wealth of intangible benefits, but they can be challenging to get into, depending on the startup's business plan. And venture capital usually works best when your startup already has a well-established framework.
Understanding who provides seed funding isn't the same as actually getting seed funding. Many startups exist outside of massive cities, and not all founders are well-connected. Thankfully, a significant amount of funding is now done digitally. There are many websites and services aimed at providing startups with the resources and experience they need to grow a successful company. These include:
Angel-finding websites: There are a wide variety of angel funding websites that directly connect founders with angels across the globe. Some of the most popular include:
Incubators: There are thousands of incubators across the United States alone, but some stand out more than others. The most successful incubators include:
Crowdfunding websites: Crowdfunding is a great way to quickly raise funds for your business. You essentially get funded by many different anonymous people in return for equity. While GoFundMe and Kickstarter are great for product-based startups, most new startups go through seed-based crowdfunding rounds on websites like:
Loans: You always have the option to take out loans and microloans. Small businesses can use SBA loans to get some quick capital, and there are a wide variety of unique loan options for startups. For example, solutions like Pipe give you up-front capital in return for recurring revenue (which may be ideal for some SaaS companies looking for hands-off funding). You can find loan options at your bank or through a variety of different websites.
In addition to these funding options, there are many resources for startups that aren't related to capital. Startup events, networking websites, forums, social media, and a wide variety of websites exist to help founders connect to other founders and business leaders. Growing a successful business takes more than money. You should always look to build connections and gain insights from industry leaders along the way.
DigitalOcean provides founders a variety of resources through Hatch, our global startup program. Sign up to learn how we can help you build and grow your startup through intelligent infrastructure solutions.
As a followup, I thought it might be fun to implement a program that’s like a tiny ssh server, but without the security. You can find it on github here, and I’ll explain how it works in this blog post.
Our goal is to be able to login to a remote computer and run commands, like you do with SSH or telnet.
The biggest difference between this program and SSH is that there’s literally no security (not even a password) – anyone who can make a TCP connection to the server can get a shell and run commands.
Obviously this is not a useful program in real life, but our goal is to learn a little more about how terminals work, not to write a useful program.
(I will run a version of it on the public internet for the next week though, you can see how to connect to it at the end of this blog post)
We’re also going to write a client, but the server is the interesting part, so let’s start there. We’re going to write a server that listens on a TCP port (I picked 7777) and creates remote terminals for any client that connects to it to use.
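As a rough sketch (assuming a handle function like the ones we'll write below), the outer loop of the server is just standard Go TCP boilerplate:

package main

import (
    "log"
    "net"
)

func main() {
    // Listen on TCP port 7777 and give every connection its own remote terminal.
    listener, err := net.Listen("tcp", ":7777")
    if err != nil {
        log.Fatal(err)
    }
    for {
        conn, err := listener.Accept()
        if err != nil {
            log.Fatal(err)
        }
        go handle(conn)
    }
}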
When the server receives a new connection it needs to:

1. create a pseudoterminal
2. start a bash shell process for the client to use
3. connect bash to the pseudoterminal
4. copy data between the TCP connection and the pseudoterminal

I just said the word “pseudoterminal” a lot, so let’s talk about what that means.
Okay, what the heck is a pseudoterminal?
A pseudoterminal is a lot like a bidirectional pipe or a socket – you have two ends, and they can both send and receive information. You can read more about the information being sent and received in what happens if you press a key in your terminal
Basically the idea is that on one end, we have a TCP connection, and on the other end, we have a bash shell. So we need to hook one part of the pseudoterminal up to the TCP connection and the other end to bash.
The two parts of the pseudoterminal are called:

- the “master”: this is the part we hook up to the TCP connection
- the “slave”: the bash process sets its stdout, stderr, and stdin to this

Once they’re connected, we can communicate with bash over our TCP connection and we’ll have a remote shell!
You might be wondering – Julia, if a pseudoterminal is kind of like a socket, why can’t we just set our bash shell’s stdout / stderr / stdin to the TCP socket?
And you can! We could write a TCP connection handler like this that does exactly that, it’s not a lot of code (server-notty.go).
func handle(conn net.Conn) {
    tty, _ := conn.(*net.TCPConn).File()
    // start bash with tcp connection as stdin/stdout/stderr
    cmd := exec.Command("bash")
    cmd.Stdin = tty
    cmd.Stdout = tty
    cmd.Stderr = tty
    cmd.Start()
}
It even kind of works – if we connect to it with nc localhost 7778, we can run commands and look at their output.
problem 1: Ctrl + C doesn’t work
The way Ctrl + C works in a remote login session is:

1. the Ctrl + C keypress gets turned into the byte 0x03 and sent through the TCP connection
2. the terminal on the remote end sees the 0x03 and sends SIGINT to the appropriate process (more on what the “appropriate process” is exactly later)

If the “terminal” is just a TCP connection, this doesn’t work, because when you send 0x03 to a TCP connection, Linux won’t magically send SIGINT to any process.
problem 2: top doesn’t work

When I try to run top in this shell, I get the error message top: failed tty get. If we strace it, we see this system call:

ioctl(2, TCGETS, 0x7ffec4e68d60) = -1 ENOTTY (Inappropriate ioctl for device)

So top is running an ioctl on its output file descriptor (2) to get some information about the terminal. But Linux is like “hey, this isn’t a terminal!” and returns an error.
There are a bunch of other things that go wrong, but hopefully at this point you’re convinced that we actually need to set bash’s stdout/stderr to be a terminal, not some other thing like a socket.
So let’s start looking at the server code and see what creating a pseudoterminal actually looks like.
Here’s some Go code to create a pseudoterminal on Linux. This is copied from github.com/creack/pty, but I removed some of the error handling to make the logic a bit easier to follow:
pty, _ := os.OpenFile("/dev/ptmx", os.O_RDWR, 0)
sname := ptsname(pty)
unlockpt(pty)
tty, _ := os.OpenFile(sname, os.O_RDWR|syscall.O_NOCTTY, 0)
In English, what we’re doing is:

1. open /dev/ptmx to get the “pseudoterminal master”. Again, that’s the part we’re going to hook up to the TCP connection
2. ask the kernel which slave device corresponds to it, which will be /dev/pts/13 or something
3. unlock the slave device
4. open /dev/pts/13 (or whatever number we got from ptsname) to get the “slave pseudoterminal device”

What do those ptsname and unlockpt functions do? They just make some ioctl system calls to the Linux kernel. All of the communication with the Linux kernel about terminals seems to be through various ioctl system calls.
Here’s the code, it’s pretty short: (again, I just copied it from creack/pty)
func ptsname(f *os.File) string {
    var n uint32
    ioctl(f.Fd(), syscall.TIOCGPTN, uintptr(unsafe.Pointer(&n)))
    return "/dev/pts/" + strconv.Itoa(int(n))
}

func unlockpt(f *os.File) {
    var u int32
    // use TIOCSPTLCK with a pointer to zero to clear the lock
    ioctl(f.Fd(), syscall.TIOCSPTLCK, uintptr(unsafe.Pointer(&u)))
}
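(One note: the ioctl function in these snippets isn't from the standard library. It's a tiny helper – roughly what creack/pty defines, give or take error handling – that just forwards the request to the ioctl system call:)

func ioctl(fd, cmd, ptr uintptr) error {
    // Everything terminal-related goes through the ioctl(2) system call.
    _, _, errno := syscall.Syscall(syscall.SYS_IOCTL, fd, cmd, ptr)
    if errno != 0 {
        return errno
    }
    return nil
}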
The next thing we have to do is connect the pseudoterminal to bash. Luckily, that’s really easy – here’s the Go code for it! We just need to start a new process and set the stdin, stdout, and stderr to tty.
cmd := exec.Command("bash")
cmd.Stdin = tty
cmd.Stdout = tty
cmd.Stderr = tty
cmd.SysProcAttr = &syscall.SysProcAttr{
    Setsid: true,
}
cmd.Start()
Easy! Though – why do we need this Setsid: true thing, you might ask? Well, I tried commenting out that code to see what went wrong. It turns out that what goes wrong is – Ctrl + C doesn’t work anymore!

Setsid: true creates a new session for the new bash process. But why does that make Ctrl + C work? How does Linux know which process to send SIGINT to when you press Ctrl + C, and what does that have to do with sessions?
I found this pretty confusing, so I reached for my favourite book for learning about this kind of thing: the linux programming interface, specifically chapter 34 on process groups and sessions.
That chapter contains a few key facts: (#3, #4, and #5 are direct quotes from the book)

- when you press Ctrl+C in a terminal, SIGINT gets sent to all the processes in the foreground process group

What’s a process group? Well, my understanding is that:

- all the processes in a pipeline (like x | y | z) are in the same process group
- processes chained together in a shell (like x && y && z) are in the same process group

I didn’t know most of this (I had no idea processes had a session ID!) so this was kind of a lot to absorb. I tried to draw a sketchy ASCII art diagram of the situation
(maybe) terminal --- session --- process group --- process
| |- process
| |- process
|- process group
|
|- process group
So when we press Ctrl+C in a terminal, here’s what I think happens:

1. the byte \x03 gets written to the “pseudoterminal master” of a terminal
2. the kernel looks at the session associated with that terminal and sends SIGINT to the session’s foreground process group

If we don’t create a new session for our new bash process, our new pseudoterminal actually won’t have any session associated with it, so nothing happens when we press Ctrl+C. But if we do create a new session, then the new pseudoterminal will have the new session associated with it.
As a quick aside, if you want to get a list of all the sessions on your Linux machine, grouped by session, you can run:
$ ps -eo user,pid,pgid,sess,cmd | sort -k3
This includes the PID, process group ID, and session ID. As an example of the output, here are the two processes in the pipeline:
bork 58080 58080 57922 ps -eo user,pid,pgid,sess,cmd
bork 58081 58080 57922 sort -k3
You can see that they share the same process group ID and session ID, but of course they have different PIDs.
That was kind of a lot but that’s all we’re going to say about sessions and process groups in this post. Let’s keep going!
We need to tell the terminal how big to be!
Again, I just copied this from creack/pty. I decided to hardcode the size to 80x24.
Setsize(tty, &Winsize{
    Cols: 80,
    Rows: 24,
})
Like with getting the terminal’s pts filename and unlocking it, setting the size is just one ioctl system call:
func Setsize(t *os.File, ws *Winsize) {
    ioctl(t.Fd(), syscall.TIOCSWINSZ, uintptr(unsafe.Pointer(ws)))
}
Pretty simple! We could do something smarter and get the real window size, but I’m too lazy.
As a reminder, our rough steps to set up this remote login server were:

1. create a pseudoterminal
2. start a bash shell process
3. connect bash to the pseudoterminal
4. ferry data between the TCP connection and the pseudoterminal

We’ve done 1, 2, and 3, now we just need to ferry information between the TCP connection and the pseudoterminal.
There are two io.Copy calls, one to copy the input from the TCP connection, and one to copy the output to the TCP connection. Here’s what the code looks like:
go func() {
    io.Copy(pty, conn)
}()
io.Copy(conn, pty)
The first one is in a goroutine just so they can both run in parallel.
Pretty simple!
I also added a little bit of code to close the TCP connection when the command exits
go func() {
    cmd.Wait()
    conn.Close()
}()
And that’s it for the server! You can see all of the Go code here: server.go.
Next, we have to write a client. This is a lot easier than the server because we don’t need to do quite as much terminal setup. There are just 3 steps:

1. put our local terminal into “raw” mode
2. copy data between the TCP connection and stdin/stdout
3. put the terminal back the way we found it when we’re done
We need to put the client terminal into “raw” mode so that every time you press a key, it gets sent to the TCP connection immediately. If we don’t do this, everything will only get sent when you press enter.
“Raw mode” isn’t actually a single thing, it’s a bunch of flags that you want to turn off. There’s a good tutorial explaining all the flags we have to turn off called Entering raw mode.
Like everything else with terminals, this requires ioctl system calls. In this case we get the terminal’s current settings, modify them, and save the old settings so that we can restore them later.

I figured out how to do this in Go by going to grep.app and typing in syscall.TCSETS to find some other Go code that was doing the same thing.
func MakeRaw(fd uintptr) syscall.Termios {
    // from https://github.com/getlantern/lantern/blob/devel/archive/src/golang.org/x/crypto/ssh/terminal/util.go
    var oldState syscall.Termios
    ioctl(fd, syscall.TCGETS, uintptr(unsafe.Pointer(&oldState)))
    newState := oldState
    newState.Iflag &^= syscall.ISTRIP | syscall.INLCR | syscall.ICRNL | syscall.IGNCR | syscall.IXON | syscall.IXOFF
    newState.Lflag &^= syscall.ECHO | syscall.ICANON | syscall.ISIG
    ioctl(fd, syscall.TCSETS, uintptr(unsafe.Pointer(&newState)))
    return oldState
}
This is exactly like what we did with the server. It’s very little code:
go func() {
    io.Copy(conn, os.Stdin)
}()
io.Copy(os.Stdout, conn)
We can put the terminal back into the mode it started in like this (another ioctl!):
func Restore(fd uintptr, oldState syscall.Termios) {
    ioctl(fd, syscall.TCSETS, uintptr(unsafe.Pointer(&oldState)))
}
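Putting the client together, a minimal sketch (using the MakeRaw and Restore helpers above, and hardcoding the address for brevity) looks something like this:

func main() {
    conn, err := net.Dial("tcp", "localhost:7777")
    if err != nil {
        log.Fatal(err)
    }
    // Step 1: put our terminal into raw mode, and put it back when we exit.
    oldState := MakeRaw(os.Stdin.Fd())
    defer Restore(os.Stdin.Fd(), oldState)
    // Step 2: copy stdin to the connection and the connection to stdout.
    go func() {
        io.Copy(conn, os.Stdin)
    }()
    io.Copy(os.Stdout, conn)
    // Step 3 (restoring the terminal) happens in the deferred Restore call.
}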
We have written a tiny remote login server that lets anyone log in! Hooray!
Obviously this has zero security so I’m not going to talk about that aspect.
For the next week or so I’m going to run a demo of this on the internet at tetris.jvns.ca. It runs tetris instead of a shell because I wanted to avoid abuse, but if you want to try it with a shell you can run it on your own computer :).
If you want to try it out, you can use netcat as a client instead of the custom Go client program we wrote, because copying information to/from a TCP connection is what netcat does. Here’s how:
stty raw -echo && nc tetris.jvns.ca 7777 && stty sane
This will let you play a terminal tetris game called tint.

You can also use the client.go program and run go run client.go tetris.jvns.ca 7777.
This protocol where we just copy bytes from the TCP connection to the terminal and nothing else is not good because it doesn’t allow us to send over information like the terminal type or the actual window size of the terminal.
I thought about implementing telnet’s protocol so that we could use telnet as a client, but I didn’t feel like figuring out how telnet works so I didn’t. (the server 30% works with telnet as is, but a lot of things are broken, I don’t quite know why, and I didn’t feel like figuring it out)
As a warning: using this server to play tetris will probably mess up your terminal a bit because it sets the window size to 80x24. To fix that I just closed the terminal tab after running that command.
If we wanted to fix this for real, we’d need to restore the window size after we’re done, but then we’d need a slightly more real protocol than “just blindly copy bytes back and forth with TCP” and I didn’t feel like doing that.
Also it sometimes takes a second to disconnect after the program exits for some reason, I’m not sure why that is.
That’s all! There are a couple of other similar toy implementations of programs I’ve written here:
Cloudflare has been using Kafka in production since 2014. We have come a long way since then, and currently run 14 distinct Kafka clusters, across multiple data centers, with roughly 330 nodes. Between them, over a trillion messages have been processed over the last eight years.
Cloudflare uses Kafka to decouple microservices and communicate the creation, change or deletion of various resources via a common data format in a fault-tolerant manner. This decoupling is one of many factors that enables Cloudflare engineering teams to work on multiple features and products concurrently.
We learnt a lot about Kafka on the way to one trillion messages, and built some interesting internal tools to ease adoption that will be explored in this blog post. The focus in this blog post is on inter-application communication use cases alone and not logging (we have other Kafka clusters that power the dashboards where customers view statistics that handle more than one trillion messages each day). I am an engineer on the Application Services team and our team has a charter to provide tools/services to product teams, so they can focus on their core competency which is delivering value to our customers.
In this blog I’d like to recount some of our experiences in the hope that it helps other engineering teams who are on a similar journey of adopting Kafka widely.
One of our Kafka clusters is creatively named Messagebus. It is the most general purpose cluster we run, and was created to:
To make it as easy to use as possible and to encourage adoption, the Application Services team created two internal projects. The first is unimaginatively named Messagebus-Client. Messagebus-Client is a Go library that wraps the fantastic Shopify Sarama library with an opinionated set of configuration options and the ability to manage the rotation of mTLS certificates.
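The Messagebus-Client itself is internal, but to make the idea concrete, an "opinionated wrapper" around Sarama might look roughly like the sketch below. The function name and defaults here are illustrative assumptions; only the Sarama calls are the real library API.

// Hypothetical sketch – not the actual Messagebus-Client API.
// Assumes imports of "crypto/tls" and "github.com/Shopify/sarama".
func NewProducer(brokers []string, tlsConfig *tls.Config) (sarama.SyncProducer, error) {
    cfg := sarama.NewConfig()
    // Opinionated defaults so every team doesn't have to rediscover them.
    cfg.Producer.RequiredAcks = sarama.WaitForAll // wait for all in-sync replicas
    cfg.Producer.Return.Successes = true          // required by the SyncProducer
    cfg.Net.TLS.Enable = true                     // mTLS everywhere
    cfg.Net.TLS.Config = tlsConfig                // rotated certificates plug in here
    return sarama.NewSyncProducer(brokers, cfg)
}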
The success of this project is also somewhat its downfall. By providing a ready-to-go Kafka client, we ensured teams got up and running quickly, but we also abstracted some core concepts of Kafka a little too much, meaning that small unassuming configuration changes could have a big impact.
One such example led to partition skew (a large portion of messages being directed towards a single partition, meaning we were not processing messages in real time; see the chart below). One drawback of Kafka is you can only have one consumer per partition, so when incidents do occur, you can’t trivially scale your way to faster throughput.
That also means before your service hits production it is wise to do some back of the napkin math to figure out what throughput might look like, otherwise you will need to add partitions later. We have since amended our library to make events like the below less likely.
The reception for the Messagebus-Client has been largely positive. We spent time as a team to understand what the predominant use cases were, and took the concept one step further to build out what we call the connector framework.
The connector framework is based on Kafka-connectors and allows our engineers to easily spin up a service that can read from a system of record and push it somewhere else (such as Kafka, or even Cloudflare’s own Quicksilver). To make this as easy as possible, we use Cookiecutter templating to allow engineers to enter a few parameters into a CLI and in return receive a ready to deploy service.
We provide the ability to configure data pipelines via environment variables. For simple use cases, we provide the functionality out of the box. However, extending the readers, writers and transformations is as simple as satisfying an interface and “registering” the new entry.
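The connector framework is also internal, so the snippet below is only a sketch of what "satisfying an interface and registering the new entry" could look like; the type and function names are assumptions, not the framework's real API.

// Illustrative sketch only – names are hypothetical. Assumes the "context" import.
type Message []byte

type Reader interface {
    Read(ctx context.Context) (Message, error)
}

type Transformation interface {
    Apply(msg Message) (Message, error)
}

type Writer interface {
    Write(ctx context.Context, msg Message) error
}

var writers = map[string]Writer{}

// RegisterWriter makes a Writer selectable via the WRITER environment variable.
func RegisterWriter(name string, w Writer) {
    writers[name] = w
}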
For example, adding the environment variables:
READER=kafka
TRANSFORMATIONS=topic_router:topic1,topic2|pf_edge
WRITER=quicksilver
will:

- configure the connector to read messages from Kafka
- run each message through the topic_router (configured with topic1 and topic2) and pf_edge transformations
- write the results to Quicksilver
Connectors come readily baked with basic metrics and alerts, so teams know they can move to production quickly but with confidence.
Below is a diagram of how one team used our connector framework to read from the Messagebus cluster and write to various other systems. This is orchestrated by a system the Application Service team runs called Communication Preferences Service (CPS). Whenever a user opts in/out of marketing emails or changes their language preferences on cloudflare.com, they are calling CPS which ensures those settings are reflected in all the relevant systems.
Alongside the Messagebus-Client library, we also provide a repo called Messagebus Schema. This is a schema registry for all message types that will be sent over our Messagebus cluster. For message format, we use protobuf and have been very happy with that decision. Previously, our team had used JSON for some of our kafka schemas, but we found it much harder to enforce forward and backwards compatibility, as well as message sizes being substantially larger than the protobuf equivalent. Protobuf provides strict message schemas (including type safety), the forward and backwards compatibility we desired, the ability to generate code in multiple languages as well as the files being very human-readable.
We encourage heavy commentary before approving a merge. Once merged, we use prototool to do breaking change detection, enforce some stylistic rules and to generate code for various languages (at time of writing it's just Go and Rust, but it is trivial to add more).
Furthermore, in Messagebus Schema we store a mapping of proto messages to a team, alongside that team’s chat room in our internal communication tool. This allows us to escalate issues to the correct team easily when necessary.
One important decision we made for the Messagebus cluster is to only allow one proto message per topic. This is configured in Messagebus Schema and enforced by the Messagebus-Client. This was a good decision to enable easy adoption, but it has led to numerous topics existing. When you consider that for each topic we create, we add numerous partitions and replicate them with a replication factor of at least three for resilience, there is a lot of potential to optimize compute for our lower throughput topics.
Making it easy for teams to observe Kafka is essential for our decoupled engineering model to be successful. We therefore have automated metrics and alert creation wherever we can to ensure that all the engineering teams have a wealth of information available to them to respond to any issues that arise in a timely manner.
We use Salt to manage our infrastructure configuration and follow a Gitops style model, where our repo holds the source of truth for the state of our infrastructure. To add a new Kafka topic, our engineers make a pull request into this repo and add a couple of lines of YAML. Upon merge, the topic and an alert for high lag (where lag is defined as the difference in time between the last committed offset being read and the last produced offset being produced) will be created. Other alerts can (and should) be created, but this is left to the discretion of application teams. The reason we automatically generate alerts for high lag is that this simple alert is a great proxy for catching a high amount of issues including:
For metrics, we use Prometheus and display them with Grafana. For each new topic created, we automatically provide a view into production rate, consumption rate and partition skew by producer/consumer. If an engineering team is called out, within the alert message is a link to this Grafana view.
In our Messagebus-Client, we expose some metrics automatically and users get the ability to extend them further. The metrics we expose by default are:
For producers:
For consumers:
Some teams use these for alerting on a significant change in throughput, others use them to alert if no messages are produced/consumed in a given time frame.
As well as providing the Messagebus framework, the Application Services team looks for common concerns within Engineering and looks to solve them in a scalable, extensible way which means other engineering teams can utilize the system and not have to build their own (thus meaning we are not building lots of disparate systems that are only slightly different).
One example is the Alert Notification System (ANS). ANS is the backend service for the “Notifications” tab in the Cloudflare dashboard. You may have noticed over the past 12 months that new alert and policy types have been made available to customers very regularly. This is because we have made it very easy for other teams to do this. The approach is:
That’s it! The producer team now has a means for customers to configure granular alerting policies for their new alert that includes being able to dispatch them via Slack, Google Chat or a custom webhook, PagerDuty or email (by both API and dashboard). Retrying and dead letter messages are managed for them, and a whole host of metrics are made available, all by making some very small changes.
Usage of Kafka (and our Messagebus tools) is only going to increase at Cloudflare as we continue to grow, and as a team we are committed to making the tooling around Messagebus easy to use, customizable where necessary and (perhaps most importantly) easy to observe. We regularly take feedback from other engineers to help improve the Messagebus-Client (we are on the fifth version now) and are currently experimenting with abstracting the intricacies of Kafka away completely and allowing teams to use gRPC to stream messages to Kafka. Blog post on the success/failure of this to follow!
If you're interested in building scalable services and solving interesting technical problems, we are hiring engineers on our team in Austin, and Remote US.
Today, we are announcing experimental support for WASI (the WebAssembly System Interface) on Cloudflare Workers and support within wrangler2 to make it a joy to work with. We continue to be incredibly excited about the entire WebAssembly ecosystem and are eager to adopt the standards as they are developed.
So what is WASI anyway? To understand WASI, and why we’re excited about it, it’s worth a quick recap of WebAssembly, and the ecosystem around it.
WebAssembly promised us a future in which code written in compiled languages could be compiled to a common binary format and run in a secure sandbox, at near native speeds. While WebAssembly was designed with the browser in mind, the model rapidly extended to server-side platforms such as Cloudflare Workers (which has supported WebAssembly since 2017).
WebAssembly was originally designed to run alongside Javascript, and requires developers to interface directly with Javascript in order to access the world outside the sandbox. To put it another way, WebAssembly does not provide any standard interface for I/O tasks such as interacting with files, accessing the network, or reading the system clock. This means if you want to respond to an event from the outside world, it's up to the developer to handle that event in JavaScript, and directly call functions exported from the WebAssembly module. Similarly, if you want to perform I/O from within WebAssembly, you need to implement that logic in Javascript and import it into the WebAssembly module.
Custom toolchains such as Emscripten or libraries such as wasm-bindgen have emerged to make this easier, but they are language specific and add a tremendous amount of complexity and bloat. We've even built our own library, workers-rs, using wasm-bindgen that attempts to make writing applications in Rust feel native within a Worker – but this has proven not only difficult to maintain, but requires developers to write code that is Workers specific, and is not portable outside the Workers ecosystem.
We need more.
WASI aims to provide a standard interface that any language compiling to WebAssembly can target. You can read the original post by Lin Clark here, which gives an excellent introduction – code cartoons and all. In a nutshell, Lin describes WebAssembly as an assembly language for a 'conceptual machine', whereas WASI is a systems interface for a ‘conceptual operating system.’
This standardization of the system interface has paved the way for existing toolchains to cross-compile existing codebases to the wasm32-wasi target. A tremendous amount of progress has already been made, specifically within Clang/LLVM via the wasi-sdk and Rust toolchains. These toolchains leverage a version of Libc, which provides POSIX standard API calls, that is built on top of WASI 'system calls.' There are even basic implementations in more fringe toolchains such as TinyGo and SwiftWasm.
Practically speaking, this means that you can now write applications that not only interoperate with any WebAssembly runtime implementing the standard, but also any POSIX compliant system! This means the exact same ‘Hello World!’ that runs on your local Linux/Mac/Windows WSL machine can also run, unchanged, on any of those runtimes.
WASI sounds great, but does it actually make my life easier? You tell us. Let’s run through an example of how this would work in practice.
First, let’s generate a basic Rust “Hello, world!” application, compile, and run it.
$ cargo new hello_world
$ cd ./hello_world
$ cargo build --release
Compiling hello_world v0.1.0 (/Users/benyule/hello_world)
Finished release [optimized] target(s) in 0.28s
$ ./target/release/hello_world
Hello, world!
It doesn’t get much simpler than this. You’ll notice we only define a main() function followed by a println to stdout.
fn main() {
    println!("Hello, world!");
}
Now, let’s take the exact same program and compile against the wasm32-wasi target, and run it in an ‘off the shelf’ wasm runtime such as Wasmtime.
$ cargo build --target wasm32-wasi --release
$ wasmtime target/wasm32-wasi/release/hello_world.wasm
Hello, world!
Neat! The same code compiles and runs in multiple POSIX environments.
Finally, let’s take the binary we just generated for Wasmtime, but instead publish it to Workers using Wrangler2.
$ npx wrangler@wasm dev target/wasm32-wasi/release/hello_world.wasm
$ curl http://localhost:8787/
Hello, world!
Unsurprisingly, it works! The same code is compatible in multiple POSIX environments and the same binary is compatible across multiple WASM runtimes.
The attentive reader may notice that we played a small trick with the HTTP request made via cURL. In this example, we actually stream stdin and stdout to/from the Worker using the HTTP request and response body respectively. This pattern enables some really interesting use cases, specifically, programs designed to run on the command line can be deployed as 'services' to the cloud.
‘Hexyl’ is an example that works completely out of the box. Here, we ‘cat’ a binary file on our local machine and ‘pipe’ the output to curl, which will then POST that output to our service and stream the result back. Following the steps we used to compile our 'Hello World!', we can compile hexyl.
$ git clone git@github.com:sharkdp/hexyl.git
$ cd ./hexyl
$ cargo build --target wasm32-wasi --release
And without further modification we were able to take a real-world program and create something we can now run or deploy. Again, let's tell wrangler2 to preview hexyl, but this time give it some input.
$ npx wrangler@wasm dev target/wasm32-wasi/release/hexyl.wasm
$ echo "Hello, world\!" | curl -X POST --data-binary @- http://localhost:8787
┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 48 65 6c 6c 6f 20 77 6f ┊ 72 6c 64 21 0a │Hello wo┊rld!_ │
└────────┴─────────────────────────┴─────────────────────────┴────────┴────────┘
Give it a try yourself by hitting https://hexyl.examples.workers.dev.
echo "Hello world\!" | curl https://hexyl.examples.workers.dev/ -X POST --data-binary @- --output -
A more useful example, but requires a bit more work, would be to deploy a utility such as swc (swc.rs), to the cloud and use it as an on demand JavaScript/TypeScript transpilation service. Here, we have a few extra steps to ensure that the compiled output is as small as possible, but it otherwise runs out-of-the-box. Those steps are detailed in github.com/zebp/wasi-example-swc, but for now let’s gloss over that and interact with the hosted example.
$ echo "const x = (x, y) => x * y;" | curl -X POST --data-binary @- https://swc-wasi.examples.workers.dev/ --output -
var x=function(a,b){return a*b}
Finally, we can also do the same with C/C++, but requires a little more lifting to get our Makefile right. Here we show an example of compiling zstd and uploading it as a streaming compression service.
github.com/zebp/wasi-example-zstd
$ echo "Hello world\!" | curl https://zstd.examples.workers.dev/ -s -X POST --data-binary @- | file -
Wrangler can make it really easy to deploy code without having to worry about the Workers ecosystem, but in some cases you may actually want to invoke your WASI based WASM module from Javascript. This can be achieved with the following simple boilerplate. An updated README will be kept at github.com/cloudflare/workers-wasi.
import { WASI } from "@cloudflare/workers-wasi";
import demoWasm from "./demo.wasm";
export default {
  async fetch(request, _env, ctx) {
    // Creates a TransformStream we can use to pipe our stdout to our response body.
    const stdout = new TransformStream();
    const wasi = new WASI({
      args: [],
      stdin: request.body,
      stdout: stdout.writable,
    });
    // Instantiate our WASM with our demo module and our configured WASI import.
    const instance = new WebAssembly.Instance(demoWasm, {
      wasi_snapshot_preview1: wasi.wasiImport,
    });
    // Keep our worker alive until the WASM has finished executing.
    ctx.waitUntil(wasi.start(instance));
    // Finally, let's reply with the WASM's output.
    return new Response(stdout.readable);
  },
};
Now with our JavaScript boilerplate and wasm, we can easily deploy our worker with Wrangler’s WASM feature.
$ npx wrangler publish
Total Upload: 473.89 KiB / gzip: 163.79 KiB
Uploaded wasi-javascript (2.75 sec)
Published wasi-javascript (0.30 sec)
wasi-javascript.zeb.workers.dev
For those of you who have been around for the better part of the past couple of decades, you may notice this looks very similar to RFC3875, better known as CGI (The Common Gateway Interface). While our example here certainly does not conform to the specification, you can imagine how this can be extended to turn the stdin of a basic 'command line' application into a full-blown http handler.
We are thrilled to learn where developers take this from here. Share what you build with us on Discord or Twitter!
...
Here at Cloudflare we're constantly working on improving our service. Our engineers are looking at hundreds of parameters of our traffic, making sure that we get better all the time.
One of the core numbers we keep a close eye on is HTTP request latency, which is important for many of our products. We regard latency spikes as bugs to be fixed. One example is the 2017 story of "Why does one NGINX worker take all the load?", where we optimized our TCP Accept queues to improve overall latency of TCP sockets waiting for accept().
Performance tuning is a holistic endeavor, and we monitor and continuously improve a range of other performance metrics as well, including throughput. Sometimes, tradeoffs have to be made. Such a case occurred in 2015, when a latency spike was discovered in our processing of HTTP requests. The solution at the time was to set tcp_rmem to 4 MiB, which minimizes the amount of time the kernel spends on TCP collapse processing. It was this collapse processing that was causing the latency spikes. Later in this post we discuss TCP collapse processing in more detail.
The tradeoff is that using a low value for tcp_rmem limits TCP throughput over high latency links. The following graph shows the maximum throughput as a function of network latency for a window size of 2 MiB. Note that the 2 MiB corresponds to a tcp_rmem value of 4 MiB due to the tcp_adv_win_scale setting in effect at the time.
For the Cloudflare products then in existence, this was not a major problem, as connections terminate and content is served from nearby servers due to our BGP anycast routing.
Since then, we have added new products, such as Magic WAN, WARP, Spectrum, Gateway, and others. These represent new types of use cases and traffic flows.
For example, imagine you're a typical Magic WAN customer. You have connected all of your worldwide offices together using the Cloudflare global network. While Time to First Byte still matters, Magic WAN office-to-office traffic also needs good throughput. For example, a lot of traffic over these corporate connections will be file sharing using protocols such as SMB. These are elephant flows over long fat networks. Throughput is the metric every eyeball watches as they are downloading files.
We need to continue to provide world-class low latency while simultaneously providing high throughput over high-latency connections.
Before we begin, let’s introduce the players in our game.
TCP receive window is the maximum number of unacknowledged user payload bytes the sender should transmit (bytes-in-flight) at any point in time. The size of the receive window can and does go up and down during the course of a TCP session. It is a mechanism whereby the receiver can tell the sender to stop sending if the sent packets cannot be successfully received because the receive buffers are full. It is this receive window that often limits throughput over high-latency networks.
net.ipv4.tcp_adv_win_scale is a (non-intuitive) number used to account for the overhead needed by Linux to process packets. The receive window is specified in terms of user payload bytes. Linux needs additional memory beyond that to track other data associated with packets it is processing.
The value of the receive window changes during the lifetime of a TCP session, depending on a number of factors. The maximum value that the receive window can be is limited by the amount of free memory available in the receive buffer, according to this table:
tcp_adv_win_scale | TCP window size |
---|---|
4 | 15/16 * available memory in receive buffer |
3 | ⅞ * available memory in receive buffer |
2 | ¾ * available memory in receive buffer |
1 | ½ * available memory in receive buffer |
0 | available memory in receive buffer |
-1 | ½ * available memory in receive buffer |
-2 | ¼ * available memory in receive buffer |
-3 | ⅛ * available memory in receive buffer |
We can intuitively (and correctly) understand that the amount of available memory in the receive buffer is the difference between the used memory and the maximum limit. But what is the maximum size a receive buffer can be? The answer is sk_rcvbuf.
sk_rcvbuf is a per-socket field that specifies the maximum amount of memory that a receive buffer can allocate. This can be set programmatically with the socket option SO_RCVBUF. This can sometimes be useful to do, for localhost TCP sessions, for example, but in general the use of SO_RCVBUF is not recommended.
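For reference, here is roughly how an application would set SO_RCVBUF from Go; net.TCPConn.SetReadBuffer issues the setsockopt call for you. The 4 MiB value below is only an example, and doing this disables receive-buffer autotuning for that socket:

conn, err := net.Dial("tcp", "example.com:443")
if err != nil {
    log.Fatal(err)
}
// SetReadBuffer sets SO_RCVBUF; the kernel caps the request at net.core.rmem_max.
if err := conn.(*net.TCPConn).SetReadBuffer(4 << 20); err != nil {
    log.Fatal(err)
}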
So how is sk_rcvbuf set? The most appropriate value for that depends on the latency of the TCP session and other factors. This makes it difficult for L7 applications to know how to set these values correctly, as they will be different for every TCP session. The solution to this problem is Linux autotuning.
Linux autotuning is logic in the Linux kernel that adjusts the buffer size limits and the receive window based on actual packet processing. It takes into consideration a number of things including TCP session RTT, L7 read rates, and the amount of available host memory.
Autotuning can sometimes seem mysterious, but it is actually fairly straightforward.
The central idea is that Linux can track the rate at which the local application is reading data off of the receive queue. It also knows the session RTT. Because Linux knows these things, it can automatically increase the buffers and receive window until it reaches the point at which the application layer or network bottleneck links are the constraint on throughput (and not host buffer settings). At the same time, autotuning prevents slow local readers from having excessively large receive queues. The way autotuning does that is by limiting the receive window and its corresponding receive buffer to an appropriate size for each socket.
The values set by autotuning can be seen via the Linux “ss” command from the iproute package (e.g. “ss -tmi”). The relevant output fields from that command are:
Recv-Q is the number of user payload bytes not yet read by the local application.
rcv_ssthresh is the window clamp, a.k.a. the maximum receive window size. This value is not known to the sender. The sender receives only the current window size, via the TCP header field. A closely-related field in the kernel, tp->window_clamp, is the maximum window size allowable based on the amount of available memory. rcv_ssthresh is the receiver-side slow-start threshold value.
skmem_r is the actual amount of memory that is allocated, which includes not only user payload (Recv-Q) but also additional memory needed by Linux to process the packet (packet metadata). This is known within the kernel as sk_rmem_alloc.
Note that there are other buffers associated with a socket, so skmem_r does not represent the total memory that a socket might have allocated. Those other buffers are not involved in the issues presented in this post.
skmem_rb is the maximum amount of memory that could be allocated by the socket for the receive buffer. This is higher than rcv_ssthresh to account for memory needed for packet processing that is not packet data. Autotuning can increase this value (up to tcp_rmem max) based on how fast the L7 application is able to read data from the socket and the RTT of the session. This is known within the kernel as sk_rcvbuf.
rcv_space is the high water mark of the rate of the local application reading from the receive buffer during any RTT. This is used internally within the kernel to adjust sk_rcvbuf.
Earlier we mentioned a setting called tcp_rmem. net.ipv4.tcp_rmem consists of three values, but in this document we are always referring to the third value (except where noted). It is a global setting that specifies the maximum amount of memory that any TCP receive buffer can allocate, i.e. the maximum permissible value that autotuning can use for sk_rcvbuf. This is essentially just a failsafe for autotuning, and under normal circumstances should play only a minor role in TCP memory management.
It’s worth mentioning that receive buffer memory is not preallocated. Memory is allocated based on actual packets arriving and sitting in the receive queue. It’s also important to realize that filling up a receive queue is not one of the criteria that autotuning uses to increase sk_rcvbuf. Indeed, preventing this type of excessive buffering (bufferbloat) is one of the benefits of autotuning.
The problem is that we must have a large TCP receive window for high BDP sessions. This is directly at odds with the latency spike problem mentioned above.
Something has to give. The laws of physics (speed of light in glass, etc.) dictate that we must use large window sizes. There is no way to get around that. So we are forced to solve the latency spikes differently.
Sometimes a TCP session will fill up its receive buffers. When that happens, the Linux kernel will attempt to reduce the amount of memory the receive queue is using by performing what amounts to a “defragmentation” of memory. This is called collapsing the queue. Collapsing the queue takes time, which is what drives up HTTP request latency.
We do not want to spend time collapsing TCP queues.
Why do receive queues fill up to the point where they hit the maximum memory limit? The usual situation is when the local application starts out reading data from the receive queue at one rate (triggering autotuning to raise the max receive window), followed by the local application slowing down its reading from the receive queue. This is valid behavior, and we need to handle it correctly.
Before exploring solutions, let’s first decide what we need as the maximum TCP window size.
As we have seen above in the discussion about BDP, the window size is determined based upon the RTT and desired throughput of the connection.
Because Linux autotuning will adjust correctly for sessions with lower RTTs and bottleneck links with lower throughput, all we need to be concerned about are the maximums.
For latency, we have chosen 300 ms as the maximum expected latency, as that is the measured latency between our Zurich and Sydney facilities. It seems reasonable enough as a worst-case latency under normal circumstances.
For throughput, although we have very fast and modern hardware on the Cloudflare global network, we don’t expect a single TCP session to saturate the hardware. We have arbitrarily chosen 3500 mbps as the highest supported throughput for our highest latency TCP sessions.
The calculation for those numbers results in a BDP of 131MB, which we round to the more aesthetic value of 128 MiB.
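The arithmetic behind that figure is simply bandwidth × round-trip time; a quick sanity check:

package main

import "fmt"

func main() {
    const (
        throughputBits = 3500e6 // 3,500 Mbit/s target throughput
        rttSeconds     = 0.300  // 300 ms worst-case RTT
    )
    bdpBytes := throughputBits * rttSeconds / 8
    // Prints 131250000, i.e. roughly 131 MB, which rounds to 128 MiB.
    fmt.Printf("%.0f\n", bdpBytes)
}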
Recall that allocation of TCP memory includes metadata overhead in addition to packet data. The ratio of actual amount of memory allocated to user payload size varies, depending on NIC driver settings, packet size, and other factors. For full-sized packets on some of our hardware, we have measured average allocations up to 3 times the packet data size. In order to reduce the frequency of TCP collapse on our servers, we set tcp_adv_win_scale to -2. From the table above, we know that the max window size will be ¼ of the max buffer space.
We end up with the following sysctl values:
net.ipv4.tcp_rmem = 8192 262144 536870912
net.ipv4.tcp_wmem = 4096 16384 536870912
net.ipv4.tcp_adv_win_scale = -2
A tcp_rmem of 512MiB and tcp_adv_win_scale of -2 results in a maximum window size that autotuning can set of 128 MiB, our desired value.
Patient: Doctor, it hurts when we collapse the TCP receive queue.
Doctor: Then don’t do that!
Generally speaking, when a packet arrives at a buffer when the buffer is full, the packet gets dropped. In the case of these receive buffers, Linux tries to “save the packet” when the buffer is full by collapsing the receive queue. Frequently this is successful, but it is not guaranteed to be, and it takes time.
There are no problems created by immediately just dropping the packet instead of trying to save it. The receive queue is full anyway, so the local receiver application still has data to read. The sender’s congestion control will notice the drop and/or ZeroWindow and will respond appropriately. Everything will continue working as designed.
At present, there is no setting provided by Linux to disable the TCP collapse. We developed an in-house patch to the kernel to disable the TCP collapse logic.
The kernel patch for our first attempt was straightforward. At the top of tcp_try_rmem_schedule(), if the memory allocation fails, we simply return (after pred_flag = 0 and tcp_sack_reset()), thus completely skipping the tcp_collapse and related logic.
It didn’t work.
Although we eliminated the latency spikes while using large buffer limits, we did not observe the throughput we expected.
One of the realizations we made as we investigated the situation was that standard network benchmarking tools such as iperf3 and similar do not expose the problem we are trying to solve. iperf3 does not fill the receive queue. Linux autotuning does not open the TCP window large enough. Autotuning is working perfectly for our well-behaved benchmarking program.
We need application-layer software that is slightly less well-behaved, one that exercises the autotuning logic under test. So we wrote one.
Anomalies were seen during our “Attempt #1” that negatively impacted throughput. The anomalies were seen only under certain specific conditions, and we realized we needed a better benchmarking tool to detect and measure the performance impact of those anomalies.
This tool has turned into an invaluable resource during the development of this patch and raised confidence in our solution.
It consists of two Python programs. The reader opens a TCP session to the daemon, at which point the daemon starts sending user payload as fast as it can, and never stops sending.
The reader, on the other hand, starts and stops reading in a way to open up the TCP receive window wide open and then repeatedly causes the buffers to fill up completely. More specifically, the reader implemented this logic:
This has the effect of highlighting any issues in the handling of packets when the buffers repeatedly hit the limit.
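The actual tool is a pair of Python programs and is not public, but a Go-flavored sketch of the reader's oscillating behavior would look something like this (the address, buffer size, and timings here are made up for illustration):

package main

import (
    "log"
    "net"
    "time"
)

func main() {
    conn, err := net.Dial("tcp", "sender.example:9000") // hypothetical daemon address
    if err != nil {
        log.Fatal(err)
    }
    buf := make([]byte, 1<<20)
    for {
        // Read as fast as possible for a while, so autotuning opens the window...
        deadline := time.Now().Add(2 * time.Second)
        for time.Now().Before(deadline) {
            if _, err := conn.Read(buf); err != nil {
                return
            }
        }
        // ...then stop reading long enough for the receive queue to fill completely.
        time.Sleep(5 * time.Second)
    }
}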
Taking a step back, let’s look at the default Linux behavior. The following is kernel v5.15.16.
The Linux kernel is effective at freeing up space in order to make room for incoming packets when the receive buffer memory limit is hit. As documented previously, the cost for saving these packets (i.e. not dropping them) is latency.
However, the latency spikes, in milliseconds, for tcp_try_rmem_schedule(), are:
tcp_rmem 170 MiB, tcp_adv_win_scale +2 (170p2):
@ms:
[0] 27093 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[1] 0 |
[2, 4) 0 |
[4, 8) 0 |
[8, 16) 0 |
[16, 32) 0 |
[32, 64) 16 |
tcp_rmem 146 MiB, tcp_adv_win_scale +3 (146p3):
@ms:
(..., 16) 25984 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[16, 20) 0 |
[20, 24) 0 |
[24, 28) 0 |
[28, 32) 0 |
[32, 36) 0 |
[36, 40) 0 |
[40, 44) 1 |
[44, 48) 6 |
[48, 52) 6 |
[52, 56) 3 |
tcp_rmem 137 MiB, tcp_adv_win_scale +4 (137p4):
@ms:
(..., 16) 37222 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[16, 20) 0 |
[20, 24) 0 |
[24, 28) 0 |
[28, 32) 0 |
[32, 36) 0 |
[36, 40) 1 |
[40, 44) 8 |
[44, 48) 2 |
These are the latency spikes we cannot have on the Cloudflare global network.
So the “something” that was not working in Attempt #1 was that the receive queue memory limit was hit early on as the flow was just ramping up (when the values for sk_rmem_alloc and sk_rcvbuf were small, ~800KB). This occurred at about the two second mark for 137p4 test (about 2.25 seconds for 170p2).
In hindsight, we should have noticed that tcp_prune_queue() actually raises sk_rcvbuf when it can. So we modified the patch in response to that, adding a guard to allow the collapse to execute when sk_rmem_alloc is less than the threshold value.
net.ipv4.tcp_collapse_max_bytes = 6291456
The next section discusses how we arrived at this value for tcp_collapse_max_bytes.
The patch is available here.
The results with the new patch are as follows:
oscil – 300ms tests
oscil – 20ms tests
oscil – 0ms tests
iperf3 – 300 ms tests
iperf3 – 20 ms tests
iperf3 – 0ms tests
All tests are successful.
In order to determine this setting, we need to understand the largest queue we can collapse without incurring unacceptable latency.
Using 6 MiB should result in a maximum latency of no more than 2 ms.
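As a rough sanity check on that figure (assuming, purely for illustration, that collapse time scales linearly with the amount of queued data), the worst-case spikes measured earlier imply a similar bound:

# Back-of-envelope estimate, assuming collapse cost is roughly linear in queue size.
worst_case_ms = 60     # worst spikes above landed in the 32-64 ms bucket at ~170 MiB of rmem
measured_mib = 170
limit_mib = 6          # proposed tcp_collapse_max_bytes of 6 MiB
print(worst_case_ms * limit_mib / measured_mib)  # roughly 2 ms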
For reference, the current production settings are:
net.ipv4.tcp_rmem = 8192 2097152 16777216
net.ipv4.tcp_wmem = 4096 16384 33554432
net.ipv4.tcp_adv_win_scale = -2
net.ipv4.tcp_collapse_max_bytes = 0
net.ipv4.tcp_notsent_lowat = 4294967295
tcp_collapse_max_bytes of 0 means that the custom feature is disabled and that the vanilla kernel logic is used for TCP collapse processing.
The new settings are:
net.ipv4.tcp_rmem = 8192 262144 536870912
net.ipv4.tcp_wmem = 4096 16384 536870912
net.ipv4.tcp_adv_win_scale = -2
net.ipv4.tcp_collapse_max_bytes = 6291456
net.ipv4.tcp_notsent_lowat = 131072
The tcp_notsent_lowat setting is discussed in the last section of this post.
The middle value of tcp_rmem was changed as a result of separate work that found that Linux autotuning was setting receive buffers too high for localhost sessions. This updated setting reduces TCP memory usage for those sessions, but does not change anything about the type of TCP session that is the focus of this post.
For the following benchmarks, we used non-Cloudflare host machines in Iowa, US, and Melbourne, Australia, performing data transfers to the Cloudflare data center in Marseille, France. In Marseille, we have some hosts configured with the existing production settings, and others with the settings described in this post. The software used is iperf3 version 3.9 on kernel 5.15.32.
| Route | RTT (ms) | Throughput with Current Settings (Mbps) | Throughput with New Settings (Mbps) | Increase Factor |
|---|---|---|---|---|
| Iowa to Marseille | 121 | 276 | 6600 | 24x |
| Melbourne to Marseille | 282 | 120 | 3800 | 32x |
Iowa-Marseille throughput
Iowa-Marseille receive window and bytes-in-flight
Melbourne-Marseille throughput
Melbourne-Marseille receive window and bytes-in-flight
Even with the new settings in place, the Melbourne to Marseille performance is limited by the receive window on the Cloudflare host. This means that further adjustments to these settings could yield even higher throughput.
The Y-axis on these charts is the 99th percentile time for TCP collapse, in seconds.
Cloudflare hosts in Marseille running the current production settings
Cloudflare hosts in Marseille running the new settings
The takeaway from these graphs is that the maximum TCP collapse time with the new settings is no worse than with the current production settings. This is the desired result.
What we have shown so far is that the receiver side seems to be working well, but what about the sender side?
As part of this work, we are setting tcp_wmem max to 512 MiB. For oscillating reader flows, this can cause the send buffer to become quite large. This represents bufferbloat and wasted kernel memory, both things that nobody likes or wants.
Fortunately, there is already a solution: tcp_notsent_lowat. This setting limits the size of unsent bytes in the write queue. More details can be found at https://lwn.net/Articles/560082.
The results are significant:
The RTT for these tests was 466 ms. Throughput is not negatively affected; it remains at full wire speed in all cases (1 Gbps). Memory usage is as reported by /proc/net/sockstat, TCP mem.
Our web servers already set tcp_notsent_lowat to 131072 for their sockets. All other senders use 4 GiB, the default value. We are changing the sysctl so that 131072 is in effect for all senders running on the server.
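For an individual sender that wants this behaviour regardless of the sysctl, the same limit can be applied per socket with the TCP_NOTSENT_LOWAT socket option. A minimal Python sketch follows; the numeric fallback value 25 comes from linux/tcp.h, since older Python releases do not export the constant:

import socket

# TCP_NOTSENT_LOWAT is 25 in linux/tcp.h; fall back to that value if the
# running Python version does not expose the constant.
TCP_NOTSENT_LOWAT = getattr(socket, "TCP_NOTSENT_LOWAT", 25)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Keep at most 128 KiB of not-yet-sent data queued for this socket,
# mirroring the 131072 value mentioned above.
sock.setsockopt(socket.IPPROTO_TCP, TCP_NOTSENT_LOWAT, 131072)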
The goal of this work is to open the throughput floodgates for high BDP connections while simultaneously ensuring very low HTTP request latency.
We have accomplished that goal.
For example, the global function fetch() (which downloads online resources) asynchronously returns a Response which has a property .body with a web stream.
This blog post covers web streams on Node.js, but most of what we learn applies to all web platforms that support them.
In an interview published today, Bethesda’s creative director Todd Howard revealed the studio’s future plans, explaining that after the Elder Scrolls 6 comes out, the studio’s next game will be Fallout 5, the next main entry in the company’s post-apocalyptic open-world RPG franchise. However, considering how long it…
I have found my happy place! Escape Simulator is such a lovely thing, a first-person simulacrum of escape rooms, built in 3D, with realistic physics. It is, as its title suggests, a simulation of attending a real-world escape room, in a way that almost all room-escape video games are not. Apart from when it’s in space.
Surviving in Horizon Forbidden West, an open-world game about what happens when Elon Musk funds one too many projects, is only partially contingent on skill. But if you want to thrive in the robo-dino apocalypse, you’ll need to equip yourself with the best gear around.
As we develop new products, we often push our operating system - Linux - beyond what is commonly possible. A common theme has been relying on eBPF to build technology that would otherwise have required modifying the kernel. For example, we’ve built DDoS mitigation and a load balancer and use it to monitor our fleet of servers.
This software usually consists of a small-ish eBPF program written in C, executed in the context of the kernel, and a larger user space component that loads the eBPF into the kernel and manages its lifecycle. We’ve found that the ratio of eBPF code to userspace code differs by an order of magnitude or more. We want to shed some light on the issues that a developer has to tackle when dealing with eBPF and present our solutions for building rock-solid production ready applications which contain eBPF.
For this purpose we are open sourcing the production tooling we’ve built for the sk_lookup hook we contributed to the Linux kernel, called tubular. It exists because we’ve outgrown the BSD sockets API. To deliver some products we need features that are just not possible using the standard API.
The source code for tubular is at github.com/cloudflare/tubular, and it allows you to do all the things mentioned above. Maybe the most interesting feature is that you can change the addresses of a service on the fly:
tubular sits at a critical point in the Cloudflare stack, since it has to inspect every connection terminated by a server and decide which application should receive it.
Failing to do so would drop or misdirect connections hundreds of times per second, so it has to be incredibly robust during day-to-day operations. We set our goals for tubular accordingly.
In the past we had built a proof-of-concept control plane for sk_lookup called inet-tool, which proved that we could get away without a persistent service managing the eBPF. Similarly, tubular has tubectl: short-lived invocations make the necessary changes, and persisting state is handled by the kernel in the form of eBPF maps. Following this design gave us crash resiliency by default, but left us with the task of mapping the user interface we wanted to the tools available in the eBPF ecosystem.
tubular consists of a BPF program that attaches to the sk_lookup hook in the kernel and userspace Go code which manages the BPF program. The tubectl command wraps both in a way that is easy to distribute.
tubectl manages two kinds of objects: bindings and sockets. A binding encodes a rule against which an incoming packet is matched. A socket is a reference to a TCP or UDP socket that can accept new connections or packets.
Bindings and sockets are "glued" together via arbitrary strings called labels. Conceptually, a binding assigns a label to some traffic. The label is then used to find the correct socket.
To create a binding that steers port 80 (aka HTTP) traffic destined for 127.0.0.1 to the label “foo”, we use tubectl bind:
$ sudo tubectl bind "foo" tcp 127.0.0.1 80
Due to the power of sk_lookup we can have much more powerful constructs than the BSD API. For example, we can redirect connections to all IPs in 127.0.0.0/24 to a single socket:
$ sudo tubectl bind "bar" tcp 127.0.0.0/24 80
A side effect of this power is that it's possible to create bindings that "overlap":
1: tcp 127.0.0.1/32 80 -> "foo"
2: tcp 127.0.0.0/24 80 -> "bar"
The first binding says that HTTP traffic to localhost should go to “foo”, while the second asserts that HTTP traffic in the localhost subnet should go to “bar”. This creates a contradiction: which binding should we choose? tubular resolves this by giving more specific bindings precedence over less specific ones.
Applying this to our example, HTTP traffic to 127.0.0.1 will be directed to “foo”, while traffic to all other IPs in 127.0.0.0/24 goes to “bar”.
sk_lookup needs a reference to a TCP or a UDP socket to redirect traffic to it. However, a socket is usually accessible only by the process which created it with the socket syscall. For example, an HTTP server creates a TCP listening socket bound to port 80. How can we gain access to the listening socket?
A fairly well known solution is to make processes cooperate by passing socket file descriptors via SCM_RIGHTS messages to a tubular daemon. That daemon can then take the necessary steps to hook up the socket with sk_lookup. This approach has several drawbacks, chief among them that applications have to be modified to cooperate and that it requires a long-running daemon to receive the sockets.
There is another way of getting at sockets, using systemd, provided socket activation is used. It works by creating an additional service unit with the correct Sockets setting. In other words, we can use a systemd oneshot service that executes when a systemd socket unit is created, registering the socket with tubular. For example:
[Unit]
Requisite=foo.socket
[Service]
Type=oneshot
Sockets=foo.socket
ExecStart=tubectl register "foo"
Since we can rely on systemd to execute tubectl at the correct times, we don't need a daemon of any kind. However, the reality is that a lot of popular software doesn't use systemd socket activation. Dealing with systemd sockets is complicated and doesn't invite experimentation. Which brings us to the final trick: pidfd_getfd:
The pidfd_getfd() system call allocates a new file descriptor in the calling process. This new file descriptor is a duplicate of an existing file descriptor, targetfd, in the process referred to by the PID file descriptor pidfd.
We can use it to iterate all file descriptors of a foreign process, and pick the socket we are interested in. To return to our example, we can use the following command to find the TCP socket bound to 127.0.0.1 port 8080 in the httpd process and register it under the "foo" label:
$ sudo tubectl register-pid "foo" $(pidof httpd) tcp 127.0.0.1 8080
It's easy to wire this up using systemd's ExecStartPost if the need arises.
[Service]
Type=forking # or notify
ExecStart=/path/to/some/command
ExecStartPost=tubectl register-pid $MAINPID foo tcp 127.0.0.1 8080
As mentioned previously, tubular relies on the kernel to store state, using BPF key / value data structures also known as maps. Using the BPF_OBJ_PIN syscall we can persist them in /sys/fs/bpf:
/sys/fs/bpf/4026532024_dispatcher
├── bindings
├── destination_metrics
├── destinations
├── sockets
└── ...
The way the state is structured differs from how the command line interface presents it to users. Labels like “foo” are convenient for humans, but they are of variable length. Dealing with variable length data in BPF is cumbersome and slow, so the BPF program never references labels at all. Instead, the user space code allocates numeric IDs, which are then used in the BPF. Each ID represents a (label, domain, protocol) tuple, internally called a destination.
For example, adding a binding for "foo" tcp 127.0.0.1 … allocates an ID for ("foo", AF_INET, TCP). Including domain and protocol in the destination allows simpler data structures in the BPF. Each allocation also tracks how many bindings reference a destination so that we can recycle unused IDs. This data is persisted into the destinations hash table, which is keyed by (Label, Domain, Protocol) and contains (ID, Count). Metrics for each destination are tracked in destination_metrics in the form of per-CPU counters.
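To make the idea concrete, here is a toy Python model of that allocation scheme; the names and structure are illustrative and are not tubular's actual Go implementation:

class Destinations:
    # Toy model: hand out small numeric IDs for (label, domain, protocol)
    # tuples and reference-count them so that IDs of unused destinations
    # can be recycled, as described above.
    def __init__(self):
        self.table = {}      # (label, domain, protocol) -> [id, refcount]
        self.free_ids = []   # IDs whose destinations no longer have bindings
        self.next_id = 0

    def acquire(self, label, domain, protocol):
        dest = (label, domain, protocol)
        if dest in self.table:
            self.table[dest][1] += 1
        else:
            if self.free_ids:
                new_id = self.free_ids.pop()
            else:
                new_id = self.next_id
                self.next_id += 1
            self.table[dest] = [new_id, 1]
        return self.table[dest][0]

    def release(self, label, domain, protocol):
        dest = (label, domain, protocol)
        entry = self.table[dest]
        entry[1] -= 1
        if entry[1] == 0:    # last binding gone: recycle the ID
            self.free_ids.append(entry[0])
            del self.table[dest]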
bindings is a longest prefix match (LPM) trie which stores a mapping from (protocol, port, prefix) to (ID, prefix length). The ID is used as a key to the sockets map which contains pointers to kernel socket structures. IDs are allocated in a way that makes them suitable as an array index, which allows using the simpler BPF sockmap (an array) instead of a socket hash table. The prefix length is duplicated in the value to work around shortcomings in the BPF API.
As discussed, bindings have a precedence associated with them. To repeat the earlier example:
1: tcp 127.0.0.1/32 80 -> "foo"
2: tcp 127.0.0.0/24 80 -> "bar"
The first binding should be matched before the second one. We need to encode this in the BPF somehow. One idea is to generate some code that executes the bindings in order of specificity, a technique we’ve used to great effect in l4drop:
1: if (mask(ip, 32) == 127.0.0.1) return "foo"
2: if (mask(ip, 24) == 127.0.0.0) return "bar"
...
This has the downside that the program gets longer the more bindings are added, which slows down execution. It's also difficult to introspect and debug such long programs. Instead, we use a specialised BPF longest prefix match (LPM) map to do the hard work. This allows inspecting the contents from user space to figure out which bindings are active, which is very difficult if we had compiled bindings into BPF. The LPM map uses a trie behind the scenes, so lookup has complexity proportional to the length of the key instead of linear complexity for the “naive” solution.
However, using a map requires a trick for encoding the precedence of bindings into a key that we can look up. Here is a simplified version of this encoding, which ignores IPv6 and uses labels instead of IDs. To insert the binding tcp 127.0.0.0/24 80 into a trie, we first convert the IP address into a number.
127.0.0.0 = 0x7f 00 00 00
Since we're only interested in the first 24 bits of the address, we can write the whole prefix as
127.0.0.0/24 = 0x7f 00 00 ??
where “?” means that the value is not specified. We choose the number 0x01 to represent TCP and prepend it and the port number (80 decimal is 0x50 hex) to create the full key:
tcp 127.0.0.0/24 80 = 0x01 50 7f 00 00 ??
Converting tcp 127.0.0.1/32 80 happens in exactly the same way. Once the converted values are inserted into the trie, the LPM trie conceptually contains the following keys and values.
LPM trie:
0x01 50 7f 00 00 ?? = "bar"
0x01 50 7f 00 00 01 = "foo"
To find the binding for a TCP packet destined for 127.0.0.1:80, we again encode a key and perform a lookup.
input: 0x01 50 7f 00 00 01 TCP packet to 127.0.0.1:80
---------------------------
LPM trie:
0x01 50 7f 00 00 ?? = "bar"
y y y y y
0x01 50 7f 00 00 01 = "foo"
y y y y y y
---------------------------
result: "foo"
y = byte matches
The trie returns “foo” since its key shares the longest prefix with the input. Note that we stop comparing keys once we reach unspecified “?” bytes, but conceptually “bar” is still a valid result. The distinction becomes clear when looking up the binding for a TCP packet to 127.0.0.255:80.
input: 0x01 50 7f 00 00 ff TCP packet to 127.0.0.255:80
---------------------------
LPM trie:
0x01 50 7f 00 00 ?? = "bar"
y y y y y
0x01 50 7f 00 00 01 = "foo"
y y y y y n
---------------------------
result: "bar"
n = byte doesn't match
In this case "foo" is discarded since the last byte doesn't match the input. However, "bar" is returned since its last byte is unspecified and therefore considered to be a valid match.
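The same simplified encoding and lookup can be modelled in a few lines of Python. This is only a model of the worked example above; the real lookup happens inside the BPF program against a kernel LPM trie and uses numeric IDs rather than label strings:

import ipaddress

TCP = 0x01

def encode(proto, port, prefix):
    # Simplified key layout from the example: 1 protocol byte, 1 port byte,
    # 4 address bytes. Returns the key plus the number of significant bits.
    net = ipaddress.ip_network(prefix)
    key = bytes([proto, port]) + net.network_address.packed
    return key, 16 + net.prefixlen

def lpm_lookup(trie, key):
    # Return the value whose specified prefix is longest while still matching key.
    best, best_bits = None, -1
    for (tkey, tbits), value in trie.items():
        whole, rem = divmod(tbits, 8)
        if tkey[:whole] != key[:whole]:
            continue
        if rem and (tkey[whole] >> (8 - rem)) != (key[whole] >> (8 - rem)):
            continue
        if tbits > best_bits:
            best, best_bits = value, tbits
    return best

trie = {
    encode(TCP, 80, "127.0.0.0/24"): "bar",
    encode(TCP, 80, "127.0.0.1/32"): "foo",
}
print(lpm_lookup(trie, encode(TCP, 80, "127.0.0.1/32")[0]))    # -> foo
print(lpm_lookup(trie, encode(TCP, 80, "127.0.0.255/32")[0]))  # -> bar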
Linux has the powerful ss tool (part of iproute2) available to inspect socket state:
$ ss -tl src 127.0.0.1
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 127.0.0.1:ipp 0.0.0.0:*
With tubular in the picture this output is not accurate anymore. tubectl bindings makes up for this shortcoming:
$ sudo tubectl bindings tcp 127.0.0.1
Bindings:
protocol prefix port label
tcp 127.0.0.1/32 80 foo
Running this command requires super-user privileges, despite in theory being safe for any user to run. While this is acceptable for casual inspection by a human operator, it's a dealbreaker for observability via pull-based monitoring systems like Prometheus. The usual approach is to expose metrics via an HTTP server, which would have to run with elevated privileges and be accessible to the Prometheus server somehow. Instead, BPF gives us the tools to enable read-only access to tubular state with minimal privileges.
The key is to carefully set file ownership and mode for state in /sys/fs/bpf. Creating and opening files in /sys/fs/bpf uses BPF_OBJ_PIN and BPF_OBJ_GET. Calling BPF_OBJ_GET with BPF_F_RDONLY is roughly equivalent to open(O_RDONLY) and allows accessing state in a read-only fashion, provided the file permissions are correct. tubular gives the owner full access but restricts read-only access to the group:
$ sudo ls -l /sys/fs/bpf/4026532024_dispatcher | head -n 3
total 0
-rw-r----- 1 root root 0 Feb 2 13:19 bindings
-rw-r----- 1 root root 0 Feb 2 13:19 destination_metrics
It's easy to choose which user and group should own state when loading tubular:
$ sudo -u root -g tubular tubectl load
created dispatcher in /sys/fs/bpf/4026532024_dispatcher
loaded dispatcher into /proc/self/ns/net
$ sudo ls -l /sys/fs/bpf/4026532024_dispatcher | head -n 3
total 0
-rw-r----- 1 root tubular 0 Feb 2 13:42 bindings
-rw-r----- 1 root tubular 0 Feb 2 13:42 destination_metrics
There is one more obstacle: systemd mounts /sys/fs/bpf in a way that makes it inaccessible to anyone but root. Adding the executable bit to the directory fixes this.
$ sudo chmod -v o+x /sys/fs/bpf
mode of '/sys/fs/bpf' changed from 0700 (rwx------) to 0701 (rwx-----x)
Finally, we can export metrics without privileges:
$ sudo -u nobody -g tubular tubectl metrics 127.0.0.1 8080
Listening on 127.0.0.1:8080
^C
There is a caveat, unfortunately: truly unprivileged access requires unprivileged BPF to be enabled. Many distros have taken to disabling it via the unprivileged_bpf_disabled sysctl, in which case scraping metrics does require CAP_BPF.
tubular is distributed as a single binary, but really consists of two pieces of code with widely differing lifetimes. The BPF program is loaded into the kernel once and then may be active for weeks or months, until it is explicitly replaced. In fact, a reference to the program (and link, see below) is persisted into /sys/fs/bpf:
/sys/fs/bpf/4026532024_dispatcher
├── link
├── program
└── ...
The user space code is executed for seconds at a time and is replaced whenever the binary on disk changes. This means that user space has to be able to deal with an "old" BPF program in the kernel somehow. The simplest way to achieve this is to compare what is loaded into the kernel with the BPF shipped as part of tubectl. If the two don't match we return an error:
$ sudo tubectl bind foo tcp 127.0.0.1 80
Error: bind: can't open dispatcher: loaded program #158 has differing tag: "938c70b5a8956ff2" doesn't match "e007bfbbf37171f0"
tag is the truncated hash of the instructions making up a BPF program, which the kernel makes available for every loaded program:
$ sudo bpftool prog list id 158
158: sk_lookup name dispatcher tag 938c70b5a8956ff2
...
By comparing the tag tubular asserts that it is dealing with a supported version of the BPF program. Of course, just returning an error isn't enough. There needs to be a way to update the kernel program so that it's once again safe to make changes. This is where the persisted link in /sys/fs/bpf comes into play. bpf_links are used to attach programs to various BPF hooks. "Enabling" a BPF program is a two-step process: first, load the BPF program; next, attach it to a hook using a bpf_link. Afterwards the program will execute the next time the hook is executed. By updating the link we can change the program on the fly, in an atomic manner.
$ sudo tubectl upgrade
Upgraded dispatcher to 2022.1.0-dev, program ID #159
$ sudo bpftool prog list id 159
159: sk_lookup name dispatcher tag e007bfbbf37171f0
…
$ sudo tubectl bind foo tcp 127.0.0.1 80
bound foo#tcp:[127.0.0.1/32]:80
Behind the scenes the upgrade procedure is slightly more complicated, since we have to update the pinned program reference in addition to the link. We pin the new program into /sys/fs/bpf:
/sys/fs/bpf/4026532024_dispatcher
├── link
├── program
├── program-upgrade
└── ...
Once the link is updated we atomically rename program-upgrade to replace program. In the future we may be able to use RENAME_EXCHANGE to make upgrades even safer.
So far we’ve completely neglected the fact that multiple invocations of tubectl could modify the state in /sys/fs/bpf at the same time. It’s very hard to reason about what would happen in this case, so in general it’s best to prevent this from ever occurring. A common solution to this is advisory file locks. Unfortunately it seems like BPF maps don't support locking.
$ sudo flock /sys/fs/bpf/4026532024_dispatcher/bindings echo works!
flock: cannot open lock file /sys/fs/bpf/4026532024_dispatcher/bindings: Input/output error
This led to a bit of head scratching on our part. Luckily it is possible to flock the directory instead of individual maps:
$ sudo flock --exclusive /sys/fs/bpf/foo echo works!
works!
Each tubectl invocation likewise invokes flock(), thereby guaranteeing that only ever a single process is making changes.
tubular is in production at Cloudflare today and has simplified the deployment of Spectrum and our authoritative DNS. It allowed us to leave behind limitations of the BSD socket API. However, its most powerful feature is that the addresses a service is available on can be changed on the fly. In fact, we have built tooling that automates this process across our global network. Need to listen on another million IPs on thousands of machines? No problem, it’s just an HTTP POST away.
Interested in working on tubular and our L4 load balancer unimog? We are hiring in our European offices.
Linux users on Tuesday got a major dose of bad news—a 12-year-old vulnerability in a system tool called Polkit gives attackers unfettered root privileges on machines running any major distribution of the open source operating system.
Previously called PolicyKit, Polkit manages system-wide privileges in Unix-like OSes. It provides a mechanism for nonprivileged processes to safely interact with privileged processes. It also allows users to execute commands with high privileges by using a component called pkexec, followed by the command.
Like most OSes, Linux provides a hierarchy of permission levels that controls when and what apps or users can interact with sensitive system resources. The design is intended to limit the damage that can happen if the app is hacked or malicious or if a user isn’t trusted to have administrative control of a network.
Recently, we made an optimization to the Cloudflare Workers runtime which reduces the amount of time Workers need to spend in memory. We're passing the savings on to you for all your Unbound Workers.
Workers are often used to implement HTTP proxies, where JavaScript is used to rewrite an HTTP request before sending it on to an origin server, and then to rewrite the response before sending it back to the client. You can implement any kind of rewrite in a Worker, including both rewriting headers and bodies.
Many Workers, though, do not actually modify the response body, but instead simply allow the bytes to pass through from the origin to the client. In this case, the Worker's application code has finished executing as soon as the response headers are sent, before the body bytes have passed through. Historically, the Worker was nevertheless considered to be "in use" until the response body had fully finished streaming.
For billing purposes, under the Workers Unbound pricing model, we charge duration-memory (gigabyte-seconds) for the time in which the Worker is in use.
On December 15-16, we made a change to the way we handle requests that are streaming through the response without modifying the content. This change means that we can mark application code as “idle” as soon as the response headers are returned.
Since no further application code will execute on behalf of the request, the system does not need to keep the request state in memory – it only needs to track the low-level native sockets and pump the bytes through. So now, during this time, the Worker will be considered idle, and could even be evicted before the stream completes (though this would be unlikely unless the stream lasts for a very long time).
Visualized it looks something like this:
As a result of this change, we've seen that the time a Worker is considered "in use" by any particular request has dropped by an average of 70%. Of course, this number varies a lot depending on the details of each Worker. Some may see no benefit, others may see an even larger benefit.
This change is totally invisible to the application. To any external observer, everything behaves as it did before. But, since the system now considers a Worker to be idle during response streaming, the response streaming time will no longer be billed. So, if you saw a drop in your bill, this is why!
The change also applies to a few other frequently used scenarios, namely Websocket proxying, reading from the cache and streaming from KV.
WebSockets: once a Worker has arranged to proxy through a WebSocket, as long as it isn't handling individual messages in your Worker code, the Worker does not remain in use during the proxying. The change applies to regular stateless Workers, but not to Durable Objects, which are not usually used for proxying.
export default {
  async fetch(request: Request) {
    // Do anything before
    const upgradeHeader = request.headers.get('Upgrade')
    if (upgradeHeader === 'websocket') {
      return await fetch(request)
    }
    // Or with other requests
  }
}
Reading from Cache: If you return the response from a cache.match call, the Worker is considered idle as soon as the response headers are returned.
export default {
  async fetch(request: Request) {
    let response = await caches.default.match('https://example.com')
    if (response) {
      return response
    }
    // get/create response and put into cache
  }
}
Streaming from KV: And lastly, when you stream from KV. This one is a bit trickier to get right, because often people retrieve the value from KV as a string or JSON object and then create a response with that value. But if you fetch the value as a stream, as done in the example below, you can create a Response with the ReadableStream.
interface Env {
  MY_KV_NAME: KVNamespace
}

export default {
  async fetch(request: Request, env: Env) {
    const readableStream = await env.MY_KV_NAME.get('hello_world.pdf', { type: 'stream' })
    if (readableStream) {
      return new Response(readableStream, { headers: { 'content-type': 'application/pdf' } })
    }
  },
}
If you are already using Unbound, your bill will have dropped automatically.
Now is a great time to check out Unbound if you haven’t already, especially since recently, we’ve also removed the egress fees. Unbound allows you to build more complex workloads on our platform and only pay for what you use.
We are always looking for opportunities to make Workers better. Often that improvement takes the form of powerful new features such as the soon-to-be released Service Bindings and, of course, performance enhancements. This time, we are delighted to make Cloudflare Workers even cheaper than they already were.
In the world of video games, 2021 may forever be remembered as the year of COVID's great reckoning. 2020 was already rough, but many of its biggest games were mostly completed in a normal development cycle. Projects slated for the following year weren't as lucky.
Thus, this year's gaming news was rich with delays, piping-hot launches, unfinished messes, and game publishers scrambling to fill their schedules with undercooked backup plans. And that says nothing about gamers themselves, wondering if crucial chips and parts might ever be plentiful enough again so they can buy the latest in console and PC gear.
Yet against all odds, fantastic games still crossed 2021's finish line, ranging from big-budget behemoths to surprising indies. This year, in an effort to reduce ranking-based ire and celebrate every game on our list, we're removing numbered rankings, with the exception of crowning a formal Ars Technica pick for Best Video Game of 2021 at the list's very end.
Before Karpenter, Kubernetes users needed to dynamically adjust the compute capacity of their clusters to support applications using Amazon EC2 Auto Scaling groups and the Kubernetes Cluster Autoscaler. Nearly half of Kubernetes customers on AWS report that configuring cluster auto scaling using the Kubernetes Cluster Autoscaler is challenging and restrictive.
When Karpenter is installed in your cluster, Karpenter observes the aggregate resource requests of unscheduled pods and makes decisions to launch new nodes and terminate them to reduce scheduling latencies and infrastructure costs. Karpenter does this by observing events within the Kubernetes cluster and then sending commands to the underlying cloud provider’s compute service, such as Amazon EC2.
Karpenter is an open-source project licensed under the Apache License 2.0. It is designed to work with any Kubernetes cluster running in any environment, including all major cloud providers and on-premises environments. We welcome contributions to build additional cloud providers or to improve core project functionality. If you find a bug, have a suggestion, or have something to contribute, please engage with us on GitHub.
Getting Started with Karpenter on AWS
To get started with Karpenter in any Kubernetes cluster, ensure there is some compute capacity available, and install it using the Helm charts provided in the public repository. Karpenter also requires permissions to provision compute resources on the provider of your choice.
Once installed in your cluster, the default Karpenter provisioner will observe incoming Kubernetes pods, which cannot be scheduled due to insufficient compute resources in the cluster and automatically launch new resources to meet their scheduling and resource requirements.
I want to show a quick start using Karpenter in an Amazon EKS cluster based on Getting Started with Karpenter on AWS. It requires the installation of AWS Command Line Interface (AWS CLI), kubectl, eksctl, and Helm (the package manager for Kubernetes). After setting up these tools, create a cluster with eksctl. This example configuration file specifies a basic cluster with one initial node.
cat <<EOF > cluster.yaml
---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: eks-karpenter-demo
  region: us-east-1
  version: "1.20"
managedNodeGroups:
  - instanceType: m5.large
    amiFamily: AmazonLinux2
    name: eks-karpenter-demo-ng
    desiredCapacity: 1
    minSize: 1
    maxSize: 5
EOF
$ eksctl create cluster -f cluster.yaml
Karpenter itself can run anywhere, including on self-managed node groups, managed node groups, or AWS Fargate. Karpenter will provision EC2 instances in your account.
Next, you need to create necessary AWS Identity and Access Management (IAM) resources using the AWS CloudFormation template and IAM Roles for Service Accounts (IRSA) for the Karpenter controller to get permissions like launching instances following the documentation. You also need to install the Helm chart to deploy Karpenter to your cluster.
$ helm repo add karpenter https://charts.karpenter.sh
$ helm repo update
$ helm upgrade --install --skip-crds karpenter karpenter/karpenter --namespace karpenter \
--create-namespace --set serviceAccount.create=false --version 0.5.0 \
--set controller.clusterName=eks-karpenter-demo \
--set controller.clusterEndpoint=$(aws eks describe-cluster --name eks-karpenter-demo --query "cluster.endpoint" --output json) \
--wait # for the defaulting webhook to install before creating a Provisioner
Karpenter provisioners are a Kubernetes resource that enables you to configure the behavior of Karpenter in your cluster. When you create a default provisioner, without further customization besides what is needed for Karpenter to provision compute resources in your cluster, Karpenter automatically discovers node properties such as instance types, zones, architectures, operating systems, and purchase types of instances. You don’t need to define these spec:requirements if there is no explicit business requirement.
cat <<EOF | kubectl apply -f -
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  # Requirements that constrain the parameters of provisioned nodes.
  # Operators { In, NotIn } are supported to enable including or excluding values
  requirements:
    - key: node.k8s.aws/instance-type # If not included, all instance types are considered
      operator: In
      values: ["m5.large", "m5.2xlarge"]
    - key: "topology.kubernetes.io/zone" # If not included, all zones are considered
      operator: In
      values: ["us-east-1a", "us-east-1b"]
    - key: "kubernetes.io/arch" # If not included, all architectures are considered
      operator: In
      values: ["arm64", "amd64"]
    - key: "karpenter.sh/capacity-type" # If not included, the webhook for the AWS cloud provider will default to on-demand
      operator: In
      values: ["spot", "on-demand"]
  provider:
    instanceProfile: KarpenterNodeInstanceProfile-eks-karpenter-demo
  ttlSecondsAfterEmpty: 30
EOF
The ttlSecondsAfterEmpty value configures Karpenter to terminate empty nodes. If this value is disabled, nodes will never scale down due to low utilization. To learn more, see Provisioner custom resource definitions (CRDs) on the Karpenter site.
Karpenter is now active and ready to begin provisioning nodes in your cluster. Create some pods using a deployment, and watch Karpenter provision nodes in response.
$ kubectl create deployment inflate \
  --image=public.ecr.aws/eks-distro/kubernetes/pause:3.2
Let’s scale the deployment and check out the logs of the Karpenter controller.
$ kubectl scale deployment inflate --replicas 10
$ kubectl logs -f -n karpenter $(kubectl get pods -n karpenter -l karpenter=controller -o name)
2021-11-23T04:46:11.280Z INFO controller.allocation.provisioner/default Starting provisioning loop {"commit": "abc12345"}
2021-11-23T04:46:11.280Z INFO controller.allocation.provisioner/default Waiting to batch additional pods {"commit": "abc123456"}
2021-11-23T04:46:12.452Z INFO controller.allocation.provisioner/default Found 9 provisionable pods {"commit": "abc12345"}
2021-11-23T04:46:13.689Z INFO controller.allocation.provisioner/default Computed packing for 10 pod(s) with instance type option(s) [m5.large] {"commit": " abc123456"}
2021-11-23T04:46:16.228Z INFO controller.allocation.provisioner/default Launched instance: i-01234abcdef, type: m5.large, zone: us-east-1a, hostname: ip-192-168-0-0.ec2.internal {"commit": "abc12345"}
2021-11-23T04:46:16.265Z INFO controller.allocation.provisioner/default Bound 9 pod(s) to node ip-192-168-0-0.ec2.internal {"commit": "abc12345"}
2021-11-23T04:46:16.265Z INFO controller.allocation.provisioner/default Watching for pod events {"commit": "abc12345"}
The provisioner’s controller listens for Pod changes; here it launched a new instance and bound the provisionable Pods to the new node.
Now, delete the deployment. After 30 seconds (ttlSecondsAfterEmpty = 30), Karpenter should terminate the empty nodes.
$ kubectl delete deployment inflate
$ kubectl logs -f -n karpenter $(kubectl get pods -n karpenter -l karpenter=controller -o name)
2021-11-23T04:46:18.953Z INFO controller.allocation.provisioner/default Watching for pod events {"commit": "abc12345"}
2021-11-23T04:49:05.805Z INFO controller.Node Added TTL to empty node ip-192-168-0-0.ec2.internal {"commit": "abc12345"}
2021-11-23T04:49:35.823Z INFO controller.Node Triggering termination after 30s for empty node ip-192-168-0-0.ec2.internal {"commit": "abc12345"}
2021-11-23T04:49:35.849Z INFO controller.Termination Cordoned node ip-192-168-116-109.ec2.internal {"commit": "abc12345"}
2021-11-23T04:49:36.521Z INFO controller.Termination Deleted node ip-192-168-0-0.ec2.internal {"commit": "abc12345"}
If you delete a node with kubectl, Karpenter will gracefully cordon, drain, and shut down the corresponding instance. Under the hood, Karpenter adds a finalizer to the node object, which blocks deletion until all pods are drained, and the instance is terminated.
Things to Know
Here are a couple of things to keep in mind about Karpenter features:
Accelerated Computing: Karpenter works with all kinds of Kubernetes applications, but it performs particularly well for use cases that require rapid provisioning and deprovisioning of large numbers of diverse compute resources. For example, this includes batch jobs to train machine learning models, run simulations, or perform complex financial calculations. You can leverage custom resources of nvidia.com/gpu, amd.com/gpu, and aws.amazon.com/neuron for use cases that require accelerated EC2 instances.
Provisioners Compatibility: Karpenter provisioners are designed to work alongside static capacity management solutions like Amazon EKS managed node groups and EC2 Auto Scaling groups. You may choose to manage the entirety of your capacity using provisioners, a mixed model with both dynamic and statically managed capacity, or a fully static approach. We recommend not using Kubernetes Cluster Autoscaler at the same time as Karpenter because both systems scale up nodes in response to unschedulable pods. If configured together, both systems will race to launch or terminate instances for these pods.
Join our Karpenter Community
Karpenter’s community is open to everyone. Give it a try, and join our working group meeting for future releases that interest you. As I said, we welcome any contributions such as bug reports, new features, corrections, or additional documentation.
To learn more about Karpenter, see the documentation and demo video from AWS Container Day.
– Channy
Defunctland’s new video goes deeper on crowd management than you ever thought possible
‘[Adult animation] is ripe for new kinds of formats and new types of stories’
A few shagedelic recommendations from across Netflix, HBO Max, and Amazon
Earlier this year, we announced Cloud Load Balancer support for Cloud Run. You might wonder, aren't Cloud Run services already load-balanced? Yes, each *.run.app endpoint load balances traffic between an autoscaling set of containers. However, with the Cloud Load Balancing integration for serverless platforms, you can now fine-tune lower levels of your networking stack. In this article, we will explain the use cases for this type of setup and build an HTTPS load balancer from the ground up for Cloud Run using Terraform.
Every Cloud Run service comes with a load-balanced *.run.app endpoint that’s secured with HTTPS. Furthermore, Cloud Run also lets you map your custom domains to your services. However, if you want to customize other details about how your load balancing works, you need to provision a Cloud HTTP load balancer yourself.
Here are a few reasons to run your Cloud Run service behind a Cloud Load Balancer:
The list goes on, Cloud HTTP Load Balancing has quite a lot of features.
The short answer is that a Cloud HTTP Load Balancer consists of many networking resources that you need to create and connect to each other. There’s no single "load balancer" object in GCP APIs.
To understand the upcoming task, let's take a look at the resources involved:
As you might imagine, it is very tedious to provision and connect these resources just to achieve a simple task like enabling CDN.
You could write a bash script with the gcloud command-line tool to create these resources; however, it will be cumbersome to check corner cases like if a resource already exists, or modified manually later. You would also need to write a cleanup script to delete what you provisioned.
This is where Terraform shines. It lets you declaratively configure cloud resources and create/destroy your stack in different GCP projects efficiently with just a few commands.
The goal of this article is to intentionally show you the hard way for each resource involved in creating a load balancer using Terraform configuration language.
We'll start with a few Terraform variables:
First, let's define our Terraform providers:
Then, let's deploy a new Cloud Run service named "hello" with the sample image, and allow unauthenticated access to it:
If you manage your Cloud Run deployments outside Terraform, that’s perfectly fine: You can still import the equivalent data source to reference that service in your configuration file.
Next, we’ll reserve a global IPv4 address for our global load balancer:
Next, let's create a managed SSL certificate that's issued and renewed by Google for you:
If you want to bring your own SSL certificates, you can create your own google_compute_ssl_certificate resource instead.
Then, make a network endpoint group (NEG) out of your serverless service:
Now, let's create a backend service that'll keep track of these network endpoints:
If you want to configure load balancing features such as CDN, Cloud Armor or custom headers, the google_compute_backend_service resource is the right place.
Then, create an empty URL map that doesn't have any routing rules and sends the traffic to this backend service we created earlier:
Next, configure an HTTPS proxy to terminate the traffic with the Google-managed certificate and route it to the URL map:
Finally, configure a global forwarding rule to route the HTTPS traffic on the IP address to the target HTTPS proxy:
After writing this module, create an output variable that lists your IP address:
When you apply these resources and set your domain’s DNS records to point to this IP address, a huge machinery starts rolling its wheels.
Soon, Google Cloud will verify your domain name ownership and start to issue a managed TLS certificate for your domain. After the certificate is issued, the load balancer configuration will propagate to all of Google’s edge locations around the globe. This might take a while, but once it completes, your load balancer will be serving traffic for your domain at Google's edge.
Astute readers will notice that so far this setup cannot handle the unencrypted HTTP traffic. Therefore, any requests that come over port 80 are dropped, which is not great for usability. To mitigate this, you need to create a new set of URL map, target HTTP proxy, and a forwarding rule with these:
As we are nearing 150 lines of Terraform configuration, you probably have realized by now, this is indeed the hard way to get a load balancer for your serverless applications.
If you like to try out this example, feel free to obtain a copy of this Terraform configuration file from this gist and adopt it for your needs.
To address the complexity in this experience, we have been designing a new Terraform module specifically to skip the hard parts of deploying serverless applications behind a Cloud HTTPS Load Balancer.
Stay tuned for the next article where we take a closer look at this new Terraform module and show you how much easier this can get.
I’ve been updating the static version after each release, and it got busier and busier over time, so I decided it was time to create a more in-depth version of it.
So here is the interactive version of PostgreSQL Observability, updated for PostgreSQL 13: Pgstats.dev.
It now has additional information about each internal component and each stat. Just follow the links and you can discover more about each of Postgres’ components and the ways they interact.
Share it!
The tracks arrive on the 6th anniversary of the miniseries
When does a model own her own image?
It was May 2015, and the Hotel Surya in Varanasi, India, was hosting “Inner Awakening,” the flagship spiritual training program of Paramahamsa Nithyananda. To his followers, he is a god incarnate, “His Divine Holiness Bhagavan Sri Nithyananda Paramashivam,” the living avatar of Shiva, or, as one devotee put it, “the…
Editor's note: Zach is one of the chairs for the Kubernetes documentation special interest group (SIG Docs).
I'm pleased to announce that the Kubernetes website now features the Docsy Hugo theme.
The Docsy theme improves the site's organization and navigability, and opens a path to improved API references. After over 4 years with few meaningful UX improvements, Docsy implements some best practices for technical content. The theme makes the Kubernetes site easier to read and makes individual pages easier to navigate. It gives the site a much-needed facelift.
For example: adding a right-hand rail for navigating topics on the page. No more scrolling up to navigate!
The theme opens a path for future improvements to the website. The Docsy functionality I'm most excited about is the theme's swaggerui shortcode, which provides native support for generating API references from an OpenAPI spec. The CNCF is partnering with Google Season of Docs (GSoD) for staffing to make better API references a reality in Q4 this year. We're hopeful to be chosen, and we're looking forward to Google's list of announced projects on August 16th. Better API references have been a personal goal since I first started working with SIG Docs in 2017. It's exciting to see the goal within reach.
One of SIG Docs' tech leads, Karen Bradshaw did a lot of heavy lifting to fix a wide range of site compatibility issues, including a fix to the last of our legacy pieces when we migrated from Jekyll to Hugo in 2018. Our other tech leads, Tim Bannister and Taylor Dolezal provided extensive reviews.
Thanks also to Björn-Erik Pedersen, who provided invaluable advice about how to navigate a Hugo upgrade beyond version 0.60.0.
The CNCF contracted with Gearbox in Victoria, BC to apply the theme to the site. Thanks to Aidan, Troy, and the rest of the team for all their work!
Meta for Mac.
For the past year, I’ve been using a high-res Sony music player to listen to my personal music collection. I detailed the entire story in the December 2019 episode of our Club-exclusive MacStories Unplugged podcast, but in short: I still use Apple Music to stream music every day and discover new artists; however, for those times when I want to more intentionally listen to music without doing anything else, I like to sit down, put on my good Sony headphones, and try to enjoy all the sonic details of my favorite songs that wouldn’t normally be revealed by AirPods or my iPad Pro’s speakers. But this post isn’t about how I’ve been dipping my toes into the wild world of audiophiles and high-resolution music; rather, I want to highlight an excellent Mac app I’ve been using to organize and edit the metadata of the FLAC music library I’ve been assembling over the past year.
These days, when I think of an old album I want to repurchase in high resolution (either 16-bit or 24-bit FLAC), or if I come across a new release I instantly fall in love with, I go ahead and buy it as a standalone FLAC digital download.1 I then organize albums with a standard Artist ⇾ Album folder hierarchy in the Mac’s Finder, as pictured below:
My FLAC music collection.
Before you ask: yes, I could do this file organization with my iPad Pro alone because the Sony music player I use (this Walkman model) can be connected via USB to the iPad (with this adapter) and comes with a standard SD card for expandable storage. However, I prefer to purchase and download FLAC music on my Mac mini because my music collection is also backed up and mirrored to Plex, and the Mac mini – as you might imagine – is running a Plex media server instance in the background at all times. The music library is stored on a 1 TB Samsung T5 external SSD that’s connected via USB-C to the Mac mini; whenever I purchase new music, I manually copy it into the T5 as well as the Sony Walkman’s SD card via the Finder.
Most of the time, FLAC music I purchase online comes with correct built-in metadata for fields such as track number, year, disc, and album artwork. But sometimes it doesn’t, which leads to the unfortunate situation of ending up with songs on my Walkman that lack album artwork or feature extra text in their titles such as “Remastered” or “Explicit”. It’s particularly annoying when artwork is missing because it ruins the experience of looking at the now playing screen while I’m focused on enjoying music. Plex doesn’t have this issue: by virtue of being an online service, Plex can search various databases for correct metadata and automatically fill missing fields in my library. One of the reasons I enjoy listening to my personal music library the old-school way is that the Walkman is completely offline, but that comes with the disadvantage of being unable to fix incorrect metadata via the web.
An example of songs with incorrect metadata on my Walkman. The word "Remastered" shouldn't be part of the song title field.
No album artwork makes me sad.
This is where Meta, an advanced music tag editor for Mac developed by French indie developer Benjamin Jaeger, comes in. I came across Meta a few months ago when, frustrated with the ugliness and bloated nature of other desktop metadata editors, I took it upon myself to find a polished, modern tag editor designed specifically with Mac users in mind. Meta is exactly what I was looking for: the app is a modern Mac utility that supports all popular audio formats (from standard MP3 and MP4 to FLAC, DSF, and AIFF) and can write metadata formats such as ID3v1, ID3v2, MP4, and APE tags. Unlike other cross-platform or open-source tag editors, Meta’s feature set is focused on one area – editing metadata for your digital music collection – and supports all the common interface paradigms you’d expect from a professional Mac app running on Catalina.
Meta can do a lot of different things, and this is by no means a comprehensive review of the app: after all, my needs are fairly basic – I drop an album into the app, fix the tags that are incorrect, and close the app. The Meta website has details and screenshots that cover all of the app’s features, and there’s also a full 3-day free trial you can use without limitations to take Meta for a spin. Having used the app for a while, I would describe it as follows: Meta is based on a powerful tag engine (called TagLib) and customizable, native Mac UI that supports a resizable sidebar, popovers, dark mode, keyboard shortcuts, and multiple view options; in addition to this flexible UI, Meta sports an incredible text-matching engine that lets you perform tasks such as batch renaming multiple files, creating new tags by mixing text and metadata tokens, finding and replacing text, and even converting file paths or metadata to new tags using regex patterns.
As I noted above, that’s a lot to take in, so let me walk through my typical use of Meta. Once I’ve identified an album that needs some of its metadata fixed, I drop the entire folder from the Finder into Meta. If I want to add artwork to it, all I need to do is select all songs in Meta’s main list, then drag the artwork image I previously downloaded from Google Images into the artwork section of Meta’s sidebar. The artwork tag will be instantly written to all FLAC files at once, and that’s it.
You can drag and drop album covers from the Finder.
But what if you don’t want to manually find and download a high-quality album artwork image beforehand? Meta has you covered there as well: by purchasing the additional, one-time-only Cover Finder feature, you’ll be able to select multiple tracks and hit ⌘⇧K to let Meta find artwork images for you. In my experience, results provided by Meta have been perfect (images fetched by Meta are high-definition covers with the correct color saturation and no watermarks); at least for me, they’re worth the extra purchase so I don’t have to manually search for each album artwork and clutter my desktop with (often low-res) images downloaded from random websites found via Google.
Meta can find album covers for you by searching online databases.
Besides album artwork, I’ve also been using Meta to fix text-based metadata by taking advantage of the app’s powerful text engine. I often come across remastered editions of albums where each song has the word “Remastered” included in the title field of its metadata – e.g. instead of “Wonderwall”, the song is called “Wonderwall Remastered”. Meta makes these kinds of batch edits extremely easy: after selecting all songs, I can select Edit ⇾ Find ⇾ Find & Replace from the menu bar (or hit ⌥⌘F) to activate the app’s find and replace UI. From there, I can type the text I want to get rid of in the ‘Find’ field and select the ‘All’ button next to ‘Replace’ (a common UI element in Mac apps) to clear the unwanted text from all title fields at once. In a nice touch, Meta visually confirms text matches found in the selected items and even lets you add special tokens such as tab characters or white space if you want to further refine your searches.
Meta's find and replace UI.
You can insert special characters in the find and replace UI with this menu.
I could also edit each song’s metadata manually by clicking into the appropriate field in the sidebar (also pictured above), but for these batch operations, Meta’s find and replace system is ideal. If you don’t want to replace an existing tag’s value but compose a new one altogether, Meta can do that as well: after selecting an item (or multiple ones), hit the pencil button in the toolbar to open the ‘Compose Tag’ popup, which lets you use arbitrary plain text as well as built-in tokens to overwrite any existing tag value. In the screenshot below, you can see how I edited the ‘Year’ field for all songs contained in The Cure’s Greatest Hits album by manually typing “2001” into the Compose Tag box.
Composing a metadata tag in Meta.
Meta can also rename files in the Finder: hit ⌘⇧R and, as with tags, you can mix and match existing tags and plain text to rename files however you see fit. In the example below, I used the track number tag, a hyphen, and the title tag to come up with a file name pattern I like to see in the Finder. Meta even remembers your preferences for file renaming, so once you’ve found a style you like, you won’t have to recreate the naming pattern every single time.
You can also let Meta rename files for you.
There are dozens of advanced features in Meta I haven’t used myself, but which contribute to making this app the premier utility for managing music collections. In the app’s preferences, you can set different options for album artwork including format, compression, and max size; you can let the app move and organize files in subfolders for you using the ‘Create Directory’ feature, which, again, uses patterns to let you craft a custom file path; there’s support for editing built-in lyrics, dates (with a visual date picker), and even track number sequences. The only thing Meta can’t do for me is add lyrics in the LRC format Sony uses for the Walkman, but I can’t blame the developer for not supporting this particular aspect of Sony’s music ecosystem.
Meta offers several customization options.
The greatest compliment I can pay to Meta is that, despite its abundance of features, it never feels overwhelming, and every functionality is always where I expect it to be. Thanks to Meta, I’ve been able to clean up and reorganize my FLAC music collection so that every album now has correct metadata on my Walkman, and I’m happier because of it.
Meta is a remarkable example of the kind of thoughtful, powerful professional tools indie developers can still create on macOS these days, and it’s one of my favorite Mac app discoveries in a long time. If you also like to manage your music library the old-fashioned way and can’t stand incorrect metadata, I can’t recommend Meta enough.
The American Psychological Association is on the defensive over its newly released clinical guidance (PDF) for treating boys and men, which links traditional masculinity ideology to a range of harms, including sexism, violence, mental health issues, suicide, and homophobia. Critics contend that the guidelines attack traditional values and innate characteristics of males.
The APA’s 10-point guidance, released last week, is intended to help practicing psychologists address the varied yet gendered experience of men and boys with whom they work. It fits into the APA’s set of other clinical guidelines for working with specific groups, including older adults, people with disabilities, and one for girls and women, which was released in 2007. The association began working on the guidance for boys and men in 2005—well before the current #MeToo era—and drew from more than four decades of research for its framing and recommendations.
That research showed that “some masculine social norms can have negative consequences for the health of boys and men,” the APA said in a statement released January 14 amid backlash. Key among these harmful norms is pressure for boys to suppress their emotions (the “common ‘boys don’t cry’ refrain”), the APA said. This has been documented to lead to “increased negative risk-taking and inappropriate aggression among men and boys, factors that can put some males at greater risk for psychological and physical health problems.” It can also make males “less willing to seek help for psychological distress.”
Fallout: New California, an enormously ambitious mod for Fallout: New Vegas, is now out, finished and ready to play.
New California—built from the bones of an older mod called Brazil—is set between the events of Fallout 2 and Fallout: New Vegas, and to simply call it a “mod” is to sell it short. This is practically a brand-new, fan-made Fallout game, one that even has voice acting and takes place on a new map set in the Black Bear Mountain National Forest in California.
Its release is timed well; Fallout 76 is out soon, and anyone disappointed that it’s a multiplayer affair, and not the traditional epic singleplayer RPG, can just try this out instead.
You can download the mod here. As for how good it is, Nathan is playing it right now, and will have some impressions up on Kotaku soon!
UPDATE: This post’s headline earlier referred to the mod as being nine years old. It’s actually been in active development since 2012.
The Environmental Working Group has released their latest “Dirty Dozen” list of supposedly pesticide-laden fruits and vegetables. (This is a misleading list, as we’ve explained before.) You may be tempted to buy organic produce, as the EWG suggests, but guess what—organic produce is not pesticide-free.
Organic farmers may use pesticides, so long as they choose from a list of approved options. The USDA organic program does not disallow all pesticides, just “synthetic” ones. (By the way, the term “pesticides” includes both bug sprays and weed killers.)
So what remains on our vegetables? The USDA periodically tests produce for pesticide residues; this is the Pesticide Data Program. (The EWG repurposes this data to create their Dirty Dozen and Clean Fifteen lists.) But the USDA does not test for the presence of organic-allowed pesticides. So the EWG is reporting the stuff on conventional crops without considering what’s present on organic crops.
So, will you lower your pesticide exposure by switching to organic? We don’t know, but the answer may very well be no. Even looking at the synthetic, non-organic pesticides in the USDA’s tests, conventional crops don’t always have the lowest amounts. Take strawberries, for example, the “dirtiest” item on the 2018 list: 75 percent of organic strawberries, and 76 percent of conventional strawberries, had pesticide levels that were under 5 percent of the allowable levels.
In other words, buying organic strawberries might expose you to more pesticide residues than buying conventional. We recommend ignoring the Dirty Dozen list entirely, and buying whichever fruits and veggies work for your diet and your budget.
Adult Swim’s hit animated show Rick and Morty finished its third season back in October, and we might not get season 4 until 2019, according to one of the show’s writers. To tide us over, Adult Swim just released a music video for Run The Jewels’ song “Oh Mama,” featuring the mad scientist and his grandson.
The video is directed by Rick and Morty director Juan Meza-León (he was responsible for episodes like “The Rickshank Rickdemption,” “The Whirly Dirly Conspiracy,” and “The ABC’s of Beth”), and it plays out a bit like a regular episode of the show: Rick and Morty fly off to a random planet, crash a club full of insectoid Gromflamites, and carnage ensues. They steal a briefcase and head out, only to encounter some additional problems,...
It’s well-known that Nintendo was originally founded in Kyoto, Japan as a maker of playing cards in 1889. But a recent historical project by the city of Kyoto has turned up, for the first time, a photograph of what the company’s headquarters looked like in that year.
The photo, and an accompanying blog post, were published in December on “Memories of Kyoto, 150 Years After The Meiji Period,” an ongoing historical project documenting the city’s history during the reign of Emperor Meiji from 1868 to 1912. Nintendo historians Florent Gorges and Isao Yamazaki shared them around today, noting that the blog post was full of fascinating little-known information about Nintendo’s founding.
Nintendo’s founder Fusajiro Yamauchi originally ran a company called Haiko, which at the time specialized in cement, the post says. His name was originally Fusajiro Fukui, but he was adopted as an adult by his boss Naoshichi Yamauchi. This is actually extremely common in Japan—government records show that the vast majority of adoptions are adult men adopting other adult men, so that their companies can remain a “family business” even when there is no biological son to inherit the company.
Nintendo stayed a family business until the retirement of Fusajiro’s great-grandson Hiroshi Yamauchi in 2002, after which Satoru Iwata took the helm. Haiko is still in operation and is run by Kazumasa Yamauchi, who wrote the City of Kyoto’s blog post.
In 1889, Fusajiro Yamauchi struck out on his own to form Marufuku Nintendo Card Co., producing traditional Japanese playing cards called hanafuda and eventually Western playing cards as well. Here’s a map of where that original headquarters used to be located, in case you want to make a pilgrimage.
If you want to see the oldest Nintendo building that’s still actually standing, there’s one very close to downtown Kyoto. According to Gorges and Yamazaki’s book, the building that housed the original headquarters was demolished in 2004, and replaced with a parking lot.
The Red Strings Club starts, and ends, with a character falling out of a high-rise window. There’s no indication that you could do anything to stop this from happening, or even that you’re supposed to. The game is more interested in how you get to that point—and the web of lies, manipulation, and tough choices you leave in your wake.
This cyberpunk point-and-click game, out today on PC, is by Deconstructeam, the developers behind 2014’s Gods Will Be Watching. Gods was a series of scenarios in which the player had to manage characters’ needs and unclear emotional states. The Red Strings Club is set up roughly the same way, but it’s not the brutal tightrope walk of life and death its predecessor was. It’s more forgiving, but it’s also more complicated. Its decisions open up into possibilities, nuances, and outcomes that don’t have clear rights and wrongs.
The game mostly takes place in the titular Red Strings Club, a bar in a cyberpunk dystopia owned by a man named Donovan. In this future, people have cybernetic implants, but Donovan has a medical condition that prevents him from getting them. This gives him a fairly anti-implant view, in contrast to Brandeis, an implant-sporting hacker.
In addition to being a bartender, Donovan is also an information broker, luring secrets out of clients through mixing signature cocktails. Brandeis and Donovan stumble upon a plan by megacorporation Supercontinent that involves making secret changes to people’s implants—or rather, the plan stumbles upon them, in the form of an AI who crashes through the bar’s door one night. From there, the game becomes a tangle of hacking, social engineering, and heavy drinking.
You spend most of your time in The Red Strings Club mixing cocktails. When a character comes into the bar, they’ll have several possible emotions—pride, depression, lust, euphoria—represented by icons in different positions on the screen. The player has to mix alcohol to move an indicator over the emotion they want to access. The labels on the bottles helpfully integrate arrows to remind you what moves where, and you later gain the ability to make the indicator move diagonally or to tilt its orientation. The controls are imprecise and uncomfortable, though it’s fun to spill booze everywhere. Nevertheless, it’s an unusual, enjoyable minigame, enriched with satisfying sound design. I regularly hurled ice cubes to the floor just to listen to them clatter.
Donovan uses these cocktails to manipulate characters into telling him what he wants to know. You have a series of objectives to uncover before taking on Supercontinent. Questions like who their CEO really is, or what role the android you found plays, can be teased out of a prideful inventor or scared out of a depressed marketing executive if you read their starting state right and adjust it accordingly. Manipulating characters through alcohol might seem deceitful, but everyone who comes into the bar wants to get drunk to change how they feel. I struggled with the idea of forcing patrons to feel unpleasant emotions, but Donovan’s intentionality felt less like selfishness and more like acknowledging the truth behind why we drink.
You’ll need whatever information you can get for the game’s climax, an epic tangle of social engineering done over a landline phone while the evil corporation closes in. It feels like the perfect combination of what you’ve been doing all along: teasing, lying, considering and exploiting the connections between people. The Red Strings Club humanizes each of its villains and protagonists, and as I called up one person pretending to be their crush in order to trick them into giving me their computer password, I felt ashamed of how clever I thought I was being, and guilty about how easy it was to get what I needed.
Your choices are tracked by a screen that gets progressively more covered in red lines as the game goes on. These “red strings” are another recurring theme: the web of secrets, lies, desires, and dreams that Donovan manipulates to get what he wants.
Things started out fairly linear for me, though they soon became an intriguing mess. The Red Strings Club has a Telltale-style indicator for when a choice you’ve made has an impact, but what that impact would be was seldom immediately apparent. Unlike Gods Will Be Watching, conversations never felt like they had a hard fail state. The game would tell me I’d done something impactful and a string would be added to my running tally, but it mostly felt like I was following my natural inclinations. With the exception of one moment, in which Brandeis has to approach another character in a methodical, clearly laid-out dance, The Red Strings Club’s choices are open-ended and vague, made in dialogue trees full of lines that run the gamut of options. It feels messy and mysterious in a very human way.
The Red Strings Club deals with all the heavy issues you’d expect from cyberpunk: free will, humanity, technology, immortality. On occasion it felt a bit sophomoric—at one point two characters argue about how depression is vital to making good art—but it was just as quick to disagree with itself and come back down to earth. It’s a deeply emotional game, with a queer love story at its center. It’s also unexpectedly diverse, considering issues related to race, gender, and sexuality all as tacit parts of its future. One playthrough took me about three and a half hours, and I’m curious to see the consequences of different choices on another playthrough. A character will still probably fall out of a window, but it will mean something different depending how I get there.
Quick question: During which season does Japan look most beautiful? Spring is probably the number one choice, followed by fall. But winter certainly is no slouch.
Twitter user Naagaoshi has been uploading a series of stunning Japanese winter photos, showing off just how lovely the country can look covered in snow and ice.
Have a look for yourself!
The original symphonic treatment for the game Destiny was long thought lost, thanks to it being shelved after a major staffing shake-up at developer Bungie. But Destiny fans received quite the Christmas miracle—albeit a legally dubious one—when an apparent rip of the album in question, titled Destiny: Music of the Spheres, was discovered and posted online.
The 8-track, 48-minute album leak, which is live as of press time at more than a few mirrors, was quickly confirmed as legitimate by two major contributors to the project: former Bungie composer Marty O'Donnell and former Bungie creative director Joe Staten. O'Donnell offered an "I think this is it" on Monday via Twitter, followed by an emphatic post of "Finally! #NeverForgotMotS." Staten followed up with acknowledgement that Sir Paul McCartney himself sang a lyric Staten had suggested, then added, "Glad #MOTS is finally out for all to hear."
In late 2013, veteran Bungie composer Marty O’Donnell finished Music of the Spheres, an eight-part musical work designed to be released alongside Destiny. It never came out. But today, thanks to an anonymous leaker, the elusive work is finally on the internet—at least until the copyright strikes hit.
Composed by O’Donnell, his partner Michael Salvatori, and former Beatle Paul McCartney, Music of the Spheres was envisioned as a musical companion to Bungie’s ambitious Destiny. But Bungie and O’Donnell spent nearly a year battling over, among other things, publisher Activision’s failure to use O’Donnell’s music in a trailer at E3 2013. In April 2014, Bungie fired O’Donnell, and despite O’Donnell’s hopes, the company indefinitely shelved Music of the Spheres. He has made several public comments on the work since, and last month, he implicitly encouraged people to share it.
“Years ago, when I was Audio Director at Bungie, I gave away nearly 100 copies of Music of the Spheres,” O’Donnell tweeted on November 30, 2017. “I don’t have the authority to give you permission to share MotS. However, no one in the world can prevent me from giving you my blessing.”
In late 2016, teenager Owen Spence started his own independent project to recreate Music of the Spheres (as well-documented in this Eurogamer piece) using publicly available material. Spence has been in touch with me since then, and yesterday he told me via Twitter DM that he and a friend, Tlohtzin Espinosa, were contacted by someone with a copy of Music of the Spheres who wanted it to be public.
So they put it on Soundcloud, where you can now listen to it all (until Activision takes it down).
O’Donnell has not yet publicly commented on the leak—I’ve reached out for his thoughts—but it’s clear that he’s wanted this to happen for years now. Must be one hell of a Christmas gift.
UPDATE (7:50pm): In an e-mail, O’Donnell told me that he’s thrilled about this development:
I’m quite relieved and happy. This was the way it was supposed to have been heard 5 years ago.
My wife and I spent the afternoon with my now 93 year old father and we showed him that people were finally able to hear this work. It made our Christmas even better. My mother, his wife of over 60 years died a couple years ago and although she loved listening and shared it with some of her friends (she was a musician) she never understood why it wasn’t released.
I don’t know who actually did it but they have my blessing. I honestly don’t know how anyone could begrudge this any longer.
Right after reinventing existing public services with private apps, hacking death may be the ultimate dream of Silicon Valley’s elite. Death is truly the final boss for anyone who thinks enough money and lines of code can solve anything, and boy are they attacking it hard. In 2016, Mark Zuckerberg and his wife Priscilla Chan pledged $3 billion toward a plan to cure all diseases by the end of the century.
“By the time we get to the end of this century, it will be pretty normal for people to live past 100,” Zuckerberg said in 2016.
And to be sure, science, medicine and unlocking more about how the body functions have already worked what would look like a miracle to someone living centuries ago: the average life expectancy for someone born in the United States doubled in just 130 years, from 39 years in 1880 to 78 years in 2011. So Zuckerberg’s prediction may actually be easier than ridding his platform of Russian bots. Longevity—and potential immortality—is a particularly popular obsession with the tech world and Silicon Valley billionaires, who seem to be offended that death would ever get the better of them, and that somehow future generations MUST be able to bask in their immortal wisdom, even if their bodies are just throbbing electric impulses in a jar sustained by regular infusions of monkey testicles (yes, a real thing people tried for a while).
The ultimate problem is that human bodies, these sad, slumping, failure-prone products of evolution, just aren’t cut out for living forever. People throughout history have tried, but the garbage body always gets in the way.
“We humans, as we are now, messy bags of blood and bone, are not really fit for immortality,” Stephen Cave, a philosopher at the University of Cambridge and author of the book Immortality: The Quest to Live Forever and How It Drives Civilization, told me. “So some really profound thing has to happen if we’re going to [change that].”
But if you’re interested in trying, oligarchs, rich lunatics and scientists throughout history provide something of a framework, and a lot more is in the works at this very moment. Below, a rundown of the different approaches that have been taken up in the never-ending quest for life to never end.
Zuckerberg, along with his Silicon Valley pals from Google and 23andMe, set up the Breakthrough Prize in 2012 to celebrate and promote science innovations, including fighting disease and living longer.
He also set up The Chan Zuckerberg Initiative, which will donate $3 billion over a decade to basic medical research with the goal of curing disease. Some have argued this approach isn’t the most efficient and the money would be better spent targeting single diseases at a time instead of an across-the-board assault. For instance, eradicating smallpox cost just $300 million in less than 10 years.
There is a problem with this approach, said Brian Kennedy, the director of the Center for Healthy Aging at the National University of Singapore: even if you treat diseases, you still haven’t cured aging itself.
“We don’t do healthcare [in the medical community], we do sick care,” he said, pointing out that the goal shouldn’t be just giving rich people access to cures for any disease but rather fundamentally attacking “aging” itself as a threat.
“Aging is the biggest risk factor to all these diseases that go out of control,” he said. “This is not just about a few billionaires living longer. This is about a million people living longer.”
Aging itself creates risks, he said, because organs and body systems inevitably break down over time. His center is researching ways to halt aging at the enzyme level. One of the most promising is the TOR pathway, a kind of cellular signaling that tells a cell to grow and divide or hunker down and turn up stress responses. Scientists believe that manipulating that pathway could slow down aging.
“It’s a really robust effect,” Kennedy said.
Once people realize that, he hopes his cause will be as flashy and imagination-capturing as Zuckerberg’s longevity quest.
“The most important thing we can do right now is to validate [the idea that] we can affect the aging,” he said. “Once that happens, I think interest level will go way up.”
Biohacking will also open up new avenues—and intense ethical debates—about how far people can go to change their genetic code. Scientists, for instance, are still carefully exploring CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) technology, which acts like a homing missile that tracks down a specific DNA strand, then cuts and pastes a new strand in its place. It can be used to alter just about every aspect of DNA. In August, scientists for the first time in the United States used the gene editing technology on a human embryo to erase a heritable heart condition.
Throughout history, people have seized on the idea that you can essentially patch or infuse the human body with parts of other bodies and cheat death, kinda like jailbreaking your iPhone so it can accept any software.
Take, for instance, Serge Voronoff, a Russian-born scientist who in the early 20th century believed animal sex glands held the secret to prolonging life. In 1920, he tried it out, taking a piece of monkey testicle and sewing it to a human’s (although, it should be noted, not his own) scrotum. The idea seemed to catch on: by the mid-1920s, according to Atlas Obscura, 300 people underwent his procedure; at least one woman received a graft of monkey ovary.
“The sex gland stimulates cerebral activity as well as muscular energy and amorous passion,” Voronoff wrote in his 1920 book, Life; a Study of the Means of Restoring Vital Energy and Prolonging Life. “It pours into the stream of the blood a species of vital fluid which restores the energy of all the cells, and spreads happiness.”
Voronoff eventually built his own monkey enclosure on his property and claimed he was able to restore 70-year-olds to their youthful vigor. Some could live to 140, he claimed. He was able to charge as much as an average year’s salary at the time for the procedure.
Voronoff died in 1951, apparently never having rejuvenated himself.
Monkey testicles have fallen out of style, but, unlike the good doctor Voronoff, the idea of harvesting body parts is still very much alive.
Trump surrogate, Gawker killer and overall too-rich person Peter Thiel has talked about his interest in parabiosis, the process of getting transfusions of blood from a younger person, to reverse aging.
“I’m looking into parabiosis stuff, which I think is really interesting. This is where they did the young blood into older mice and they found that had a massive rejuvenating effect,” he told Inc. “It’s one of these very odd things where people had done these studies in the 1950s and then it got dropped altogether. I think there are a lot of these things that have been strangely under-explored.”
Studies have shown this may just be the latest snake oil tactic, though targeted to lunatic rich people who can’t help but be fascinated by the idea of literally feeding off the young.
It certainly didn’t work out for Alexander Bogdanov, a science fiction writer, doctor, and pioneer of cybernetics who dabbled in blood transfusions in the 1920s. He thought that if he gave himself a steady series of blood transfusions, he could become functionally immortal. This thirst for blood met a hubristic end: he eventually took a blood transfusion from a malaria patient. The patient survived, but Bogdanov did not.
Cave’s book breaks up immortality schemes throughout history into four classifications: the first one, staying alive in the body, involves all those life extending medicines and life hacking gene therapies discussed above. The second one involves resurrection, an idea that has fascinated people throughout history, from Luigi Galvani’s 18th century experiments running electricity through a dead frog’s legs to more recent efforts at cryonics, the process of freezing your body with the hope that future medicine or technology will be able to restore you to health. Some in Silicon Valley are interested in new versions of cryonics, but so far it doesn’t seem to be getting that much attention.
Cave’s third path involves finding immortality through the soul, something that has driven religious wars and controlled populations for eons. It takes as a fact that your physical body is a degrading mess that will one day betray you, but that doesn’t matter, since the soul is the real, eternal essence of who you are. But it’s best left to religious discussions nowadays, as science can’t seem to prove it exists.
“If bits of your brain are damaged, then bits of you, the fundamental deepest idea of who you are, have disappeared,” Cave said. He’s talking about the idea that if the soul is the indestructible essence of you that can survive eternity, why does our essence change when we suffer brain damage or other personality altering maladies? If your soul lets you live forever, which version of you exactly is the one that lives forever?
“That leads us to wonder if your soul is somehow supposed to be maintaining all these things, why can’t your soul do that for you? If it can do that when your whole brain is gone, why can’t that do that when a part of your brain is gone?”
But some techies argue the nature of these projects will redefine what a soul is entirely: not so much a ghostly essence of your being connected to a higher power, but more a specific set of brain signatures unique to you, a code that can be hacked like any other code.
“Consider, then, the modern soul as the unique neuronal-synaptic signature integrating brain and body through a complex electrochemical flow of neurotransmitters. Each person has one, and they are all different,” Marcelo Gleiser, a theoretical physicist, writer, and a professor of natural philosophy, physics and astronomy at Dartmouth College, wrote for NPR in April. “Can all this be reduced to information, such as to be replicated or uploaded into other-than-you substrates? That is, can we obtain sufficient information about this brain-body map so as to replicate it in other devices, be they machines or cloned biological replicas of your body?”
Google’s lifespan-extending project Calico launched in 2013 with a mission statement that calls aging “one of life’s greatest mysteries.” Also a great mystery is exactly what Calico has been up to: the company’s work has been shrouded in secrecy, which has led to lots of curiosity and frustration from others in the anti-aging field. So far, according to a New Yorker piece in April, all that’s known is that the company is tracking a thousand mice from birth to death to find “biomarkers” of aging, which can be described as biochemical substances whose levels predict death. The company has invested in drugs that may help fight diabetes and Alzheimer’s.
The tech side of things brings us to Cave’s fourth path to immortality: legacy. For ancient civilizations, that meant creating monuments, having your living relatives chant your name after you’re gone or carving names on tomb walls.
“If your name was spoken and your monuments still stood, they thought,” he wrote in his book, “then at least a part of you still lived.”
Today’s legacies look different than giant stone shrines, but the ego behind them is probably comparable. The idea of uploading consciousness to the cloud has crossed from science fiction into science possible: Russian web mogul Dmitry Itskov in 2011 launched the 2045 Initiative, an experiment to make himself immortal within the next 30 years by creating a robot that can store a human personality.
“Different scientists call it uploading or they call it mind transfer. I prefer to call it personality transfer,” Itskov told the BBC last year.
So here is one of the obvious main problems with Silicon Valley-led innovations, like many other tech-based lurches into the advanced future: it could be too expensive for everyone to afford. Which in turn could mean that we’ll have a class of near-immortals, or cloud-based consciousnesses, ruling over people bound to their horrifying analog bodies. The meshing of human/computer/nanotech parts will also open up a whole new thinkpiece industry about when someone stops being a “person” altogether and is just lines of code.
Kennedy said opening these options up to everyone will depend on what avenue of research proves the most effective. If aging is treated as a disease (and healthcare in general somehow becomes affordable to everyone), there’s hope.
“The challenge is to figure out ways to improve health span and get it to everybody as quickly as possible,” he said. “If it’s drugs, it’s achievable. If it’s a bunch of transfusions of young blood, that’s less achievable.”
If all this has you bristling at the thought of techies creating their own super race of “disruptors” impervious to the torments of time and the limits of flesh, that’s understandable. But Cave said you may be encouraged by the entire history of people who’ve chased extended lifespans, from ancient Egypt to the people clinging to their diets and exercise throughout the 21st century.
“The one thing that everyone who has pursued immortality has in common,” he said, “is that they’re now six foot under, pushing up daisies.”
From an outsider’s perspective, what’s a cult and what’s not a cult can seem obvious. Not a cult: your new book group. Cult: that group your second cousin joined where all the women are renamed Meadow and are betrothed to Jeremy, their unshowered leader. Simple!
In reality, cults aren’t always so obvious; sometimes the cult-like aspects of an organization reveal themselves to you slowly, when you’re already fully invested.
In our latest podcast episode, Rick Alan Ross outlined the three criteria for establishing whether or not a group is a cult—as identified by psychiatrist Robert Jay Lifton.
1. There is an authoritarian figure in charge of the group who is revered like a god. Everyone and every decision revolves around said figure.
2. People in the group act against their own best interests, but in the best interest of the group (and the charismatic leader). This occurs through a process called “thought reform.”
3. The group exploits its members. The degree of harm inflicted on each member varies wildly depending on the group—some may take your money, others might inflict physical and sexual abuse.
Here’s the paper by Robert Jay Lifton in its entirety—it’s fascinating and worth a read, especially if there are any organizations in your life you have doubts about. (Your book group leader is strangely charismatic, and everyone always loves her book suggestions...)
In this episode we discussed cults: how they operate, how you identify one, what it’s like to be in one, and how to get out. To that end, we spoke with author Rebecca Stott, whose book In the Days of Rain: A Father, a Daughter, a Cult details her childhood in the Exclusive Brethren, a cult that believed the world is ruled by Satan. We also talked to Rick Alan Ross, the founder and Executive Director of The Cult Education Institute. And we talked with Elizabeth Yuko, a bioethicist and journalist who’s written extensively about cults.
Listen to The Upgrade above or find us in all the usual places where podcasts are served, including Apple Podcasts, Google Play, Spotify, iHeartRadio, Stitcher, and NPR One. Please subscribe, rate, and review!
We reached out to every organization mentioned by Rick Ross that he either referred to as a cult or described having received complaints about. We received responses from two organizations. Jehovah’s Witnesses declined to comment directly, but sent us these links: Are Jehovah’s Witnesses a Cult? Are Jehovah’s Witnesses an American Sect?
We also heard from Landmark, who commented, in part:
Landmark is a global personal and professional growth, training, and development company that delivers programs and courses that empower and develop people to fulfill on what’s really important to them. Recognized for their leading-edge material and methodology, Landmark’s programs equip people to produce breakthrough results in areas such as career, relationships, productivity, and overall quality of life. More than 2.4 million people in more than 20 countries have taken Landmark’s programs and Landmark is considered one of the leading companies in its field.
Every week we like to let you in on the upgrades we’ve made in our own lives. This week we talked about bloomers, killing fruit flies, and this mesmerizing podcast.
There are two ways to reach out:
We look forward to hearing from you!
Every business has a starting point. For Japanese arcades, one of them was on department store rooftops.
Yes, Japanese arcades have their roots in the carnival-type games often seen at local religious festivals. But way before coffee houses installed tabletop cabinets to capitalize on the late-’70s Space Invaders craze, there were game machines atop department stores in cities like Osaka and Tokyo.
Case in point: Namco, which still operates a large number of arcades in Japan. The company got its start making attractions, rides, and games for department store roofs. In 1955, when Namco was still Nakamura Manufacturing, the nascent company built two wooden horses for a Yokohama department store. This was not the first of its kind. The maiden rooftop amusement area opened on a Tokyo department store in 1903, with wooden horses, a seesaw and an indoor play area.
By the time Nakamura Manufacturing entered the scene in the mid-1950s, this was a well established tradition in Japan, with a history of buildings covered with ropeways, Ferris Wheels and other attractions. No wonder they’re known as okujou yuuenchi (屋上遊園地), literally “rooftop amusement park.”
But with cities rebuilding after World War II, these rooftop play areas offered new business opportunities.
In the early 1960s, Nakamura Manufacturing constructed a kiddy attraction called “Roadway Ride” on top of Mitsukoshi Department Store, with children “driving” small cars on a railroad-type track.
Not everything was a ride, as there were also coin-operated and carnival-type games. Other companies, like Sega, also made mechanical games that were enjoyed on these rooftops.
Website Tetsugaku News recently published a series of old photos of Japanese department store roofs, showing the different attractions.
Via website Nippon Sumizumi Kanko, here are some of the retro games found on the rooftop play area of Nagasaki’s Hamaya Department Store.
Around this same time, there were also bowling alleys with mechanical games, but everything radically changed with Space Invaders. Dedicated establishments, called “inbeedaa hausu” (“invader house”), started popping up across the country, evolving into the Japanese gaming arcades of today.
When I came to Japan in 2001, rooftop arcades were still fairly common. But these days, more and more of them are being shuttered instead of getting needed updates and repairs.
For example, the previously mentioned Hamaya Department Store’s rooftop amusement park is no more, and after 56 years in operation, the okujou yuuenchi on Hanshin Department Store in Osaka was shut down, as documented by Man-san’s Photo Gallery. These are just two of many, and it’s increasingly rare for department stores to have them. Sadly, the age of rooftop amusement parks is drawing to a close.
But many modern Japanese arcades still have this rooftop amusement park DNA, offering small rides for children. Some, like Joypolis, are modern throwbacks to the okujou yuuenchi of yore.
The only thing that is missing is the rooftop setting.
For more photos, check out Nippon Sumizumi Kanko and Man-san’s Photo Gallery.
This article was originally posted on April 6, 2017.
A big part of Your Name takes place in Tokyo. Over on Tofugu.com, Kanae Nakamine visited the real-world locations depicted in the movie.
The resulting comparisons between the anime and real life are fascinating.
Nakamine also created a helpful itinerary in case you want to make a Your Name seichijunrei (聖地巡礼) or pilgrimage, visiting the places depicted in the hit anime.
Be sure to check Tofugu for more comparisons and info regarding where these Tokyo spots are located.
In case you missed it, you can read Kotaku’s Your Name review right here.
Here’s a cool video tribute to Half-Life 2: Episode 3 by Anomalous Materials. Oh, what could have been.
There’s an official Evangelion x New Balance line of sneakers coming out in Japan and China this weekend.
There’ll be four colours available, each in a very subtle scheme, and they’ll also come in nice padded zip-up boxes that feature Evangelion imagery.
They’ll go on sale for around USD$100. More pics at Hypebeast.
I’ve been reviewing Razer accessories and hardware for ages, and aside from the odd license tie-in and those rainbow headsets, they’ve mostly had one thing in common—they’ve been black. Well now we’ve got the Mercury line, so bright and white they look like the restless spirits of real Razer products.
Razer’s actually got two new color schemes going on, Mercury (white) and Gunmetal (gray), but gray is really just light black, baby steps compared to the jarring contrast between Razer’s jet black lineup and these ivory beauties.
First off we have the Invicta “gaming surface.” I put gaming surface in quotes because it’s generally just a fancy term for mouse pad, but in this case it might be appropriate. For one, this has a case.
The Invicta is a dual-sided gaming surface (it’s growing on me). One side is smooth for speed. The other is rough, for control. It costs $60 and weighs around 1.5 pounds, not counting the case.
The weight is because the Invicta pad comes with an aluminum base plate.
That’s one serious mouse pad. And here’s a serious mouse for it. The Lancehead Tournament Edition is a mouse with a 16,000 DPI 5G optical sensor, tracking at 450 inches per second. As a man who generally gets by with a trackball, that’s ridiculous.
The $80 Lancehead is completely symmetrical, with side buttons on the left and right so anyone can use it, regardless of dominant hand. Plug it in and place it on top of the Invicta and it looks like a racing pod from some utopian future.
The Kraken 7.1 v2 is Razer’s middle-of-the-road headset, priced at $99.99 (the $200 Razer Tiamat 7.1 V2 is their latest top-of-the-line wired set). It’s a great set of cans with a pretty good retractable mic. The switch to white isn’t as dramatic for this one, mainly because I never see it once it’s on my head. It’s still very pretty.
The $150 Blackwidow X, on the other hand, barely looks like a Razer product anymore. Rather than go straight-up white everything, Razer made the exposed metal plate silver, giving the board a lovely contrast. The Razer logo on the front is barely visible when the unit’s LEDs are off.
There is something otherworldly about a set of shine-through white keycaps with soft RGB lighting piping through. It’s dreamy.
Razer’s even gone as far as running a batch of their custom switches (this one has the clicky greens) with a pale gray housing as opposed to the normal black. The company’s made a real commitment here.
The whole set comes together nicely. Razer is known for creating aggressive-looking gaming hardware with bold (some would say obnoxious) branding. The Mercury line softens those rough edges considerably.
Nier: Automata audio engineer Masami Ueda has written a cool blog post detailing how he implemented composer Keiichi Okabe’s secondary hacking soundtrack. It’s easy to overlook how much work something like that takes, and fun to get a look at Ueda’s process.
Released this week on Steam by Sand Sailor Studio, Black The Fall is a 2D side-scrolling puzzle platformer that shares many similarities with last year’s indie hit, Inside. Dark and moody atmosphere? Clever puzzles requiring plenty of trial and error? It’s got all that, plus a lil’ robot friend.
Comparisons between Black The Fall and Playdead’s dark and thoughtful platformer are unavoidable. The game takes place in a ruined world ravaged by communism. The player is a tired worker trying to escape years of toil under the regime. Though the game eventually opens up to the brighter outside world, its opening moments take place within the gray walls and black shadows of a vast factory. Early in the game the player gains access to a laser pointer, a tool that manipulates mechanical devices as well as the brains of mind-controlled workers.
It’s all very dark and hopeless, until the player makes a new friend.
The introduction of this little robot pupper makes a dark game a bit brighter. The robodog can help the player reach higher spots, activate electronic switches, and even act as a brace, as seen in the GIF atop this post.
I streamed an hour and a half of Black The Fall earlier this week. Check out the stream archive below to see how my escape attempt panned out.