Merging an 'Override' Ruby Hash into the Original Hash

Featured image for sharing metadata for article

I have a very specific use case for this, so feel free to discount this if it doesn't make much sense to you.

I'm using Chef to configure a Java JAR file for a few of my personal backend services. I have a few differences between my local testing environment and production, such as the secrets being used and the path to the JAR to be deployed.

I want to make it possible to configure these differences in the least amount of JSON.

For instance, the current Chef JSON file has a large set of configuration in it (trimmed for brevity):

{
  "jar": {
    "user": "jar",
    "group": "jar",
    "directory": "/var/jar",
    "jar_uri": "https://repo.remote.example.com/jar.jar",
    "secrets": [
      {
        "name": "gitlab.apiKey",
        "vault_path": "jar/gitlabApiKey",
        "type": "String"
      },
      {
        "name": "indieAuth.accessToken",
        "vault_path": "jar/indieAuthAccessToken",
        "type": "String"
      }
    ]
  },
  "run_list": [
    "jar-deploy-cookbook::default"
  ]
}

I'd love a way to utilise the pre-built JSON file, but provide an ability to override this config, in such a way:

{
  "jar": {
    "secrets": [
      {
        "name": "gitlab.apiKey",
        "hardcoded_secret": "too-easy-to-guess"
      },
      {
        "name": "indieAuth.accessToken",
        "hardcoded_secret": "fakeyMcFakerson"
      }
    ]
  }
}

This is because I'm lazy and don't want to repeat all the JSON config, but I also want to keep it as close to the production config, removing any risk of configuration drift. But there are some difficulties with trying to override these nested bits of configuration, as I don't want to have to repeat all the properties of a given nested Hash i.e. the type for each secret.

So my requirements are:

  • allow me to override certain fields within a Hash
  • support Hashes nested within arrays (all the way down!)
  • allow me to completely override an array if possible
  • allow me to pass a null for a given key

The code

So how do we do this? We apply this StackOverflow answer, but tweak this quite a bit, so we can work with the above requirements:

class ::Hash
  def deep_merge_override(second)
    merger = proc do |key, original, override|
      if original.instance_of?(Hash) && override.instance_of?(Hash)
        original.merge(override, &merger)
      else
        if original.instance_of?(Array) && override.instance_of?(Array)
          # if the lengths are different, prefer the override
          if original.length != override.length
            override
          else
            # if the first element in the override's Array is a Hash, then we assume they all are
            if override[0].instance_of?(Hash)
              original.map.with_index do |v, i|
                # deep merge everything between the two arrays
                original[i].merge(override[i], &merger)
              end
            else
              # if we don't have a Hash in the override,
              # override the whole array with our new one
              override
            end
          end
        else
          override
        end
      end
    end
    self.merge(second.to_h, &merger)
  end
end
Diff between the two files

If it's more helpful, here's the diff:

 class ::Hash
   def deep_merge_override(second)
     merger = proc do |key, original, override|
       if original.instance_of?(Hash) && override.instance_of?(Hash)
         original.merge(override, &merger)
       else
         if original.instance_of?(Array) && override.instance_of?(Array)
-          original | override
-        else
-          if [:undefined, nil, :nil].include?(override)
-            original
-          else
+          # if the lengths are different, prefer the override
+          if original.length != override.length
             override
+          else
+            # if the first element in the override's Array is a Hash, then we assume they all are
+            if override[0].instance_of?(Hash)
+              original.map.with_index do |v, i|
+                # deep merge everything between the two arrays
+                original[i].merge(override[i], &merger)
+              end
+            else
+              # if we don't have a Hash in the override,
+              # override the whole array with our new one
+              override
+            end
           end
+        else
+          override
         end
       end
     end
     self.merge(second.to_h, &merger)
   end
 end
-

Testing

So how does this actually do? Does this handle the scenarios we want? (Spoiler alert: yes!) Let's go through the scenarios I wanted.

Overriding certain fields

If I want to override the user, I would have the following override.json:

{
  "jar": {
    "user": "foobar"
  }
}

Then we can see this overrides just the user property:

{
  "jar": {
    "user": "foobar",
    "group": "jar",
    "directory": "/var/jar",
    "jar_uri": "https://repo.remote.example.com/jar.jar",
    "secrets": [
      {
        "name": "gitlab.apiKey",
        "vault_path": "jar/gitlabApiKey",
        "type": "String"
      },
      {
        "name": "indieAuth.accessToken",
        "vault_path": "jar/indieAuthAccessToken",
        "type": "String"
      }
    ]
  },
  "run_list": [
    "jar-deploy-cookbook::default"
  ]
}

Completely Override an Array

To override the Chef recipes for this run:

{
  "run_list": [
    "spectat::local",
    "jar-deploy-cookbook::default"
  ]
}

Which leaves everything else untouched:

{
  "jar": {
    "user": "jar",
    "group": "jar",
    "directory": "/var/jar",
    "jar_uri": "https://repo.remote.example.com/jar.jar",
    "secrets": [
      {
        "name": "gitlab.apiKey",
        "vault_path": "jar/gitlabApiKey",
        "type": "String"
      },
      {
        "name": "indieAuth.accessToken",
        "vault_path": "jar/indieAuthAccessToken",
        "type": "String"
      }
    ]
  },
  "run_list": [
    "spectat::local",
    "jar-deploy-cookbook::default"
  ]
}

Nested Hash Arrays

The most important scenario is being able to override a specific field in a Hash within an Array. For instance, if I want to specify that the secret should be hardcoded, not via Vault:

{
  "jar": {
    "secrets": [
      {
        "vault_path": null,
        "hardcoded_secret": "too-easy-to-guess"
      },
      {
        "vault_path": null,
        "hardcoded_secret": "fakeyMcFakerson"
      }
    ]
  }
}

Note that the ordering here is important, and needs to align with the secrets in the existing configuration. We don't need to put all the fields, just the ones we want to override.

In this case, we want to unset the vault_path in each (which the cookbook ignores if it is null), and add a new field hardcoded_secret.

This gives us the result:

{
  "jar": {
    "user": "jar",
    "group": "jar",
    "directory": "/var/jar",
    "jar_uri": "https://repo.remote.example.com/jar.jar",
    "secrets": [
      {
        "name": "gitlab.apiKey",
        "vault_path": "jar/gitlabApiKey",
        "type": "String"
      },
      {
        "name": "indieAuth.accessToken",
        "vault_path": "jar/indieAuthAccessToken",
        "type": "String"
      }
    ]
  },
  "run_list": [
    "jar-deploy-cookbook::default"
  ]
}

Symbols and Strings

However, there was something I didn't expect. I'm not overriding this with a separate JSON file, but am instead overriding this configuration using a Ruby Hash in my Vagrantfile. This is loaded in a way similar to:

require 'json'

original = JSON.parse(File.read './original2.json')

override = {
  jar: {
    secrets: [
      {
        vault_path: nil,
        hardcoded_secret: 'too-easy-to-guess'
      },
      {
        vault_path: nil,
        hardcoded_secret: 'fakeyMcFakerson'
      }
    ]
  }
}

merged = original.deep_merge_override(override)

This gives us the puzzling output below, which has two jar hashes:

{
  "jar": {
    "user": "jar",
    "group": "jar",
    "directory": "/var/jar",
    "jar_uri": "https://repo.remote.example.com/jar.jar",
    "secrets": [
      {
        "name": "gitlab.apiKey",
        "vault_path": "jar/gitlabApiKey",
        "type": "String"
      },
      {
        "name": "indieAuth.accessToken",
        "vault_path": "jar/indieAuthAccessToken",
        "type": "String"
      }
    ]
  },
  "run_list": [
    "jar-deploy-cookbook::default"
  ],
  "jar": {
    "secrets": [
      {
        "vault_path": null,
        "hardcoded_secret": "too-easy-to-guess"
      },
      {
        "vault_path": null,
        "hardcoded_secret": "fakeyMcFakerson"
      }
    ]
  }
}

This will likely fail the Chef run, or at least confuse it, or even worse it'll silently work until an underlying change in the JSON parser.

The solution here is to follow the instructions in my post Converting Ruby Hash keys to Strings/Symbols, making sure that we stringify all the keys, so they get merged correctly.

Written by Jamie Tanna's profile image Jamie Tanna on , and last updated on .

Content for this article is shared under the terms of the Creative Commons Attribution Non Commercial Share Alike 4.0 International, and code is shared under the Apache License 2.0.

#blogumentation #ruby.

This post was filed under articles.

Interactions with this post

Interactions with this post

Below you can find the interactions that this page has had using WebMention.

Have you written a response to this post? Let me know the URL:

Do you not have a website set up with WebMention capabilities? You can use Comment Parade.