Diffing Pretty-Printed JSON Files

Because so many tools now use JSON as their configuration format, sooner or later we need to compare two JSON files.

But it can be quite difficult to see what has changed, especially without an online tool (which may be risky, depending on what's in the JSON you're comparing).

Let's walk through two examples of diffing a pair of files.

For these examples, I would strongly recommend using a diff tool that allows for side-by-side view, such as diff -y or vimdiff.

Simple example

First, let's start with a straightforward example.

1.json
{
    "key": [
        123,
        456
    ],
    "key2": "value"
}

2.json
{
    "key": [
        456
    ],
    "key2": "value"
}

We can utilise one of the solutions documented in my pretty-print-json series to pretty-print the JSON before diffing it, to try and make it more readable:

$ diff -u <(python -mjson.tool 1.json) <(python -mjson.tool 2.json)

This gives us the following output, which is a little clearer:

--- /proc/self/fd/11	2020-08-24 19:22:51.513741484 +0100
+++ /proc/self/fd/13	2020-08-24 19:22:51.513741484 +0100
@@ -1,7 +1,6 @@
 {
     "key": [
-        123,
         456
     ],
     "key2": "value"
 }
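
For readers who'd rather stay in one language, the pretty-print-then-diff pipeline above can be sketched with Python's standard library alone. This is a minimal illustration, not a replacement for the shell command: the helper names `pretty_lines` and `diff_json` are made up for this example, and the two documents are inlined as strings rather than read from 1.json and 2.json.

```python
import difflib
import json


def pretty_lines(text):
    """Parse a JSON string and return its pretty-printed form as a list of lines."""
    # indent=4 matches the default indentation of `python -m json.tool`.
    return json.dumps(json.loads(text), indent=4).splitlines()


def diff_json(old_text, new_text):
    """Return a unified diff (as a list of lines) of two pretty-printed JSON strings."""
    return list(
        difflib.unified_diff(
            pretty_lines(old_text),
            pretty_lines(new_text),
            fromfile="1.json",
            tofile="2.json",
            lineterm="",
        )
    )


# The two documents from the simple example, inlined for illustration.
ONE = '{"key": [123, 456], "key2": "value"}'
TWO = '{"key": [456], "key2": "value"}'

if __name__ == "__main__":
    print("\n".join(diff_json(ONE, TWO)))
```

Because both sides are re-serialised the same way, whitespace differences between the input files never reach the diff; only the data does.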

More complex example

However, that example isn't very realistic: in practice we usually have a large, nested document, and the keys are often in a different order in each file:

1.json
{
  "Resources": {
    "Ec2Instance": {
      "Type": "AWS::EC2::Instance",
      "Properties": {
        "SecurityGroups": [
          {
            "Ref": "InstanceSecurityGroup"
          },
          "MyExistingSecurityGroup"
        ],
        "KeyName": {
          "Ref": "KeyName"
        },
        "ImageId": "ami-7a11e213"
      }
    },
    "InstanceSecurityGroup": {
      "Type": "AWS::EC2::SecurityGroup",
      "Properties": {
        "GroupDescription": "Enable direct HTTP access",
        "SecurityGroupIngress": [
          {
            "IpProtocol": "https",
            "FromPort": "443",
            "ToPort": "443",
            "CidrIp": "0.0.0.0/0"
          }
        ]
      }
    }
  }
}

2.json
{
  "Parameters": {
    "KeyName": {
      "Description": "The EC2 Key Pair to allow SSH access to the instance",
      "Type": "AWS::EC2::KeyPair::KeyName"
    }
  },
  "Resources": {
    "Ec2Instance": {
      "Type": "AWS::EC2::Instance",
      "Properties": {
        "SecurityGroups": [
          {
            "Ref": "InstanceSecurityGroup"
          },
          "MyExistingSecurityGroup"
        ],
        "KeyName": {
          "Ref": "KeyName"
        },
        "ImageId": "ami-7a11e213"
      }
    },
    "InstanceSecurityGroup": {
      "Type": "AWS::EC2::SecurityGroup",
      "Properties": {
        "GroupDescription": "Enable SSH access via port 22",
        "SecurityGroupIngress": [
          {
            "ToPort": "22",
            "FromPort": "22",
            "CidrIp": "0.0.0.0/0",
            "IpProtocol": "tcp",
            "Description": "Only found on 2.json"
          }
        ]
      }
    }
  }
}

If we run the same diff command as above, quite a few lines show up as changed, even though the two files actually have a lot in common:

--- /proc/self/fd/11	2020-08-24 20:34:48.645452666 +0100
+++ /proc/self/fd/13	2020-08-24 20:34:48.648785937 +0100
@@ -1,4 +1,10 @@
 {
+    "Parameters": {
+        "KeyName": {
+            "Description": "The EC2 Key Pair to allow SSH access to the instance",
+            "Type": "AWS::EC2::KeyPair::KeyName"
+        }
+    },
     "Resources": {
         "Ec2Instance": {
             "Type": "AWS::EC2::Instance",
@@ -18,13 +24,14 @@
         "InstanceSecurityGroup": {
             "Type": "AWS::EC2::SecurityGroup",
             "Properties": {
-                "GroupDescription": "Enable direct HTTP access",
+                "GroupDescription": "Enable SSH access via port 22",
                 "SecurityGroupIngress": [
                     {
-                        "IpProtocol": "https",
-                        "FromPort": "443",
-                        "ToPort": "443",
-                        "CidrIp": "0.0.0.0/0"
+                        "ToPort": "22",
+                        "FromPort": "22",
+                        "CidrIp": "0.0.0.0/0",
+                        "IpProtocol": "tcp",
+                        "Description": "Only found on 2.json"
                     }
                 ]
             }
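
Part of that noise comes purely from key order. As a small illustration (the variable names and the trimmed-down rule are invented for this sketch), the two objects below hold identical data, yet their pretty-printed text differs, and the text is all a line-based diff gets to see. It assumes Python 3.7 or later, where dictionaries keep insertion order.

```python
import json

# The same ingress rule, with its keys written in a different order (trimmed for brevity).
first = '{"IpProtocol": "tcp", "FromPort": "22", "ToPort": "22"}'
second = '{"ToPort": "22", "FromPort": "22", "IpProtocol": "tcp"}'

# Parsed into dictionaries, the two are equal: key order carries no meaning.
same_data = json.loads(first) == json.loads(second)

# Pretty-printed without sorting, each keeps its own key order, so the text differs.
same_text = (
    json.dumps(json.loads(first), indent=4)
    == json.dumps(json.loads(second), indent=4)
)

print(same_data, same_text)
```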

So instead, I'd recommend reaching for something that sorts the keys of the JSON documents before diffing, so that the diff reflects differences in content rather than in ordering, such as this Ruby script I've written. As an aside, this relies on JSON keys not being order-dependent; if yours are, you'll not have a great time.

This means that we can run the following:

$ diff -u <(ruby sort-keys.rb 1.json) <(ruby sort-keys.rb 2.json)

This gives us the following, slightly easier to understand, output:

--- /proc/self/fd/11	2020-08-24 20:10:11.630671779 +0100
+++ /proc/self/fd/13	2020-08-24 20:10:11.630671779 +0100
@@ -1,4 +1,10 @@
 {
+  "Parameters": {
+    "KeyName": {
+      "Description": "The EC2 Key Pair to allow SSH access to the instance",
+      "Type": "AWS::EC2::KeyPair::KeyName"
+    }
+  },
   "Resources": {
     "Ec2Instance": {
       "Properties": {
@@ -17,13 +23,14 @@
     },
     "InstanceSecurityGroup": {
       "Properties": {
-        "GroupDescription": "Enable direct HTTP access",
+        "GroupDescription": "Enable SSH access via port 22",
         "SecurityGroupIngress": [
           {
             "CidrIp": "0.0.0.0/0",
-            "FromPort": "443",
-            "IpProtocol": "https",
-            "ToPort": "443"
+            "Description": "Only found on 2.json",
+            "FromPort": "22",
+            "IpProtocol": "tcp",
+            "ToPort": "22"
           }
         ]
       },
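
The sort-keys.rb script itself isn't shown in this post, so the following is not its implementation. It's a rough Python sketch of the same idea, assuming all we need is to sort object keys at every level while leaving array order untouched; `sorted_pretty_lines` and `sorted_diff` are names invented for the example, and the two ingress rules are inlined as strings.

```python
import difflib
import json


def sorted_pretty_lines(text):
    """Parse JSON and re-serialise it with object keys sorted at every nesting level."""
    # sort_keys=True only reorders object keys; array elements keep their order.
    return json.dumps(json.loads(text), indent=2, sort_keys=True).splitlines()


def sorted_diff(old_text, new_text):
    """Return a unified diff (as a list of lines) of two key-sorted JSON strings."""
    return list(
        difflib.unified_diff(
            sorted_pretty_lines(old_text),
            sorted_pretty_lines(new_text),
            fromfile="1.json",
            tofile="2.json",
            lineterm="",
        )
    )


# The ingress rules from the two example files, inlined for illustration.
RULE_1 = '{"IpProtocol": "https", "FromPort": "443", "ToPort": "443", "CidrIp": "0.0.0.0/0"}'
RULE_2 = (
    '{"ToPort": "22", "FromPort": "22", "CidrIp": "0.0.0.0/0", '
    '"IpProtocol": "tcp", "Description": "Only found on 2.json"}'
)

if __name__ == "__main__":
    print("\n".join(sorted_diff(RULE_1, RULE_2)))
```

If you'd rather not write a script at all, Python's json.tool also accepts a --sort-keys flag (Python 3.5 and later), so python -m json.tool --sort-keys 1.json may be enough.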

Written by Jamie Tanna.

Content for this article is shared under the terms of the Creative Commons Attribution Non Commercial Share Alike 4.0 International, and code is shared under the Apache License 2.0.

This post is part of the series pretty-print-json.
