Quite often, the task of investigating the voodoo wizardry behind asset pipeline issues falls to me. A case in point is this previous blog post, where software versioning was the issue; another time, it was faulty vendoring.
The usual place to start is to see what files are actually different on the servers whose assets are not the same. At the very least, the application folder (under the environment variable $STACK_PATH on Cloud 66 Rails stacks) and the gem installation path should be checked.
Of course, running diff file by file is not acceptable - that is far too time consuming. Enter the wonders of bash!
The following one-liner should do the trick! It prints one SHA checksum for the directory you're in and one for each of its immediate subdirectories, each computed from the checksums of every file beneath that directory. (See the appendix for a thorough explanation!)
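Pieced together from the explanation in the appendix, the command looks roughly like the sketch below. Two things are assumptions of this sketch: the wrapper name dir_checksums is made up for convenience, and each directory is handed to bash -c as "$1" instead of being spliced into the command string through {}, so that names containing spaces or quotes survive. It needs bash (for <(...)) and versions of find and sort that support -print0 and -z.

```shell
# Sketch of the one-liner, wrapped in a (hypothetical) function so it is easy to reuse.
# Output: one line per directory -> "<checksum>  -<TAB><directory>"
dir_checksums() {
  find -L . -maxdepth 1 -type d -print0 | sort -z \
    | xargs -0 -I{} bash -c 'find -L "$1" -type f -print0 | sort -z | xargs -0 shasum | cut -d " " -f1 | shasum' _ {} \
    | paste - <(find -L . -maxdepth 1 -type d | sort)
}
```

Run it from the directory you want to fingerprint, for example cd "$STACK_PATH" && dir_checksums.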
For example, given the following output for ls -al in my application folder:
The command gives the following output:
I can now compare values across the servers and see which subdirectories have different contents.
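Comparing by eye gets tedious with many directories. One possible shortcut, assuming the command's output from each server has been saved to a file (server1.txt and server2.txt are made-up names), is to let sort and uniq pick out the lines that do not appear in both files:

```shell
# Print the directories whose checksum line differs between two saved outputs.
# Assumes each line looks like "<checksum>  -<TAB><directory>" and that a
# directory appears at most once per file.
changed_dirs() {
  sort "$1" "$2" | uniq -u | cut -f2 | sort -u
}
```

Calling changed_dirs server1.txt server2.txt then lists only the directories worth digging into.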
If the SHA checksum is different for a given subdirectory, I can go into that subdirectory and repeat the SHA command. If all the subdirectories have the same SHA checksums, but the folder I am in (meaning . in the output above) has a different checksum, then the files in . are different. To find which file it is, run a simplified version of the SHA command that looks for files without going into subdirectories:
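A minimal sketch of that simplified command, using the same building blocks as the appendix describes (the function name file_checksums is just a convenience for this example):

```shell
# Checksum only the files directly inside the current directory,
# without descending into subdirectories.
file_checksums() {
  find -L . -maxdepth 1 -type f -print0 | sort -z | xargs -0 shasum
}
```

Each output line is a checksum followed by the file it belongs to, so the lines can be compared across servers directly.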
I hope this helps, and I wish you all happier debugging times!
Appendix
Command Explanation
Bash one-liners without comments are almost impossible to figure out. As a matter of fact, the explanation for the above command is already slowly but surely slipping from my brain - let's write it down!
find "$DIRECTORY" is a wonderful command that I use often - as the name suggests, it will output a list of all files and directories in "$DIRECTORY". By default, it will recursively dig into all the subdirectories.
Adding the -L flag to find before "$DIRECTORY" will make it follow symlinks. This is necessary if you have symlinks to directories, but will break if you have circular symlinks - use as required.
Flags for find after "$DIRECTORY" usually specify what type of things you are searching for. -maxdepth 1 means I want to list only "$DIRECTORY" itself and its immediate children without going deeper, and -type d means I want only directories, not files. -print0 will make the delimiter between entries the \0 character instead of \n - it will become obvious why once we handle xargs.
So far, we have find -L "$DIRECTORY" -maxdepth 1 -type d -print0, which will list "$DIRECTORY" itself and all of its immediate subdirectories. We will pipe this through sort -z for consistency across runs - the -z flag tells sort that the delimiter is the \0 character.
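To see what this first stage produces, the \0 delimiters can be turned back into newlines for display. This is only an inspection aid; list_dirs is a name invented for the sketch:

```shell
# Show the sorted, \0-delimited directory list in human-readable form.
list_dirs() {
  find -L "$1" -maxdepth 1 -type d -print0 | sort -z | tr '\0' '\n'
}
```

For instance, list_dirs . prints . first, followed by each immediate subdirectory in sorted order.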
Then we pipe to xargs, which will pass each argument from the preceding command to the one that follows. In essence, we are passing each subdirectory to the bash command that follows. The -0 flag tells xargs that the delimiter between arguments is the \0 character. Otherwise, xargs splits its input on whitespace (spaces and newlines), which would break for subdirectories with spaces! Finally, the -I{} flag tells xargs to substitute each argument wherever it sees {} - by default it would append the arguments to the end of the command.
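A small experiment makes the difference visible. The input here is invented for the demonstration, and bracket_items is a hypothetical helper name:

```shell
# Wrap every \0-terminated item on stdin in brackets, one echo per item.
bracket_items() {
  xargs -0 -I{} echo "[{}]"
}

printf 'a b\0c\0' | bracket_items   # prints [a b] and then [c]
printf 'a b\n' | xargs -n1 echo     # prints a and then b: the space split the name in two
```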
So now, we are passing each subdirectory to the command inside bash -c "..." - we have to call bash because we are passing each subdirectory to a command that contains pipes. Without it, the calling shell would interpret the pipes itself, and they would apply to the combined output of xargs rather than to each subdirectory separately.
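The effect is easy to reproduce on toy input. In this sketch (both function names are made up), the first helper runs a pipe once per item, while the second lets the calling shell own the pipe:

```shell
# The pipe lives inside bash -c, so wc runs once for every item.
bytes_per_item() {
  xargs -0 -I{} bash -c 'printf %s "$1" | wc -c' _ {}
}

# The pipe belongs to the calling shell, so wc sees the output of all items at once.
lines_overall() {
  xargs -0 -I{} printf '%s\n' {} | wc -l
}
```

Feeding printf 'ab\0cde\0' into bytes_per_item yields one count per item, whereas lines_overall yields a single number for the whole input.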
The command inside bash -c "..." is similar to the previous one, except now we are finding -type f (that is, files) and there is no -maxdepth, so files at any depth are included. This will get the SHA checksum for each file in the subdirectory and output one line per file: the checksum, followed by two spaces and the file path.
We will now cut only the SHA checksum from this output using cut, which can be used to give you a specific column from an output. -d ' ' means that the delimiter between columns is the space character, and -f1 means I want the first column - the SHA checksum. This is so that the file name does not interfere with the next shasum in the case of a directory symlink.
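As a quick illustration of the cut step in isolation (only_checksum is a hypothetical name, and the sample line in the usage note is invented):

```shell
# Keep only the first space-delimited column (the checksum) of each input line.
only_checksum() {
  cut -d ' ' -f1
}
```

Something like shasum some_file | only_checksum then prints the bare checksum. Because the first field ends at the first space, file names that themselves contain spaces do no harm.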
We then pipe all the file SHA checksums through a shasum to get the overall SHA checksum for that subdirectory!
The final paste command is simply used to append the directory name to each subdirectory SHA checksum. Without it, every output line would end in - instead of a name, since the final shasum read a string of SHA checksums from standard input instead of being given a file.
Normally, paste works on two files: it joins each row of the first with the corresponding row of the second, separated by a tab. Since we are piping to paste though, we can use the - character to mean 'whatever came from the pipe'. Finally, I want the output of find -L . -maxdepth 1 -type d | sort to be appended to whatever came from the pipe (the SHA checksums). But I don't want to create a separate file for this purpose. I can get around this by using bash process substitution, denoted by <(...). This runs ... and exposes its output through a path under /dev/fd or a named pipe (FIFO), which can be read in the same way as a normal file.
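Here is the same trick on toy data, as a sketch that requires bash. The helper name label_lines and the sample values are invented for the example:

```shell
# Append one label per line to whatever arrives on stdin.
# "-" is the piped input; <(...) stands in for a second file holding the labels.
label_lines() {
  paste - <(printf '%s\n' "$@")
}

printf '1\n2\n' | label_lines a b   # prints "1<TAB>a" and then "2<TAB>b"
```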
