Docker: Error response from daemon: manifest not found: manifest unknown

I was seeing this rather character-dense yet information-sparse error from Docker:

Error response from daemon: manifest for graylog/graylog:latest not found: manifest unknown: manifest unknown

Yes, I was hacking around with Graylog in this specific instance.

As it turns out, Graylog doesn’t have a latest tag on Docker Hub, and Docker will append :latest to any image reference that you attempt to pull without explicitly specifying a tag.

What happens if there’s no :latest tag on the registry? You get the above error. Search your container registry and repository for the tags it actually offers and pick the one that makes the most sense for you.
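For example, pulling with an explicit tag looks like this. The tag below is purely illustrative; check the image’s page on Docker Hub (or query the registry with a tool like skopeo, if you have it installed) for the tags that actually exist:

docker pull graylog/graylog:3.2

# or list the available tags without pulling anything (requires skopeo)
skopeo list-tags docker://docker.io/graylog/graylog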

Solving Kubectl “Error from server (InternalError): an error on the server (“”) has prevented the request from succeeding”

My Problem

When switching to a Linode Kubernetes Engine (LKE) cluster context, any command such as kubectl get pods or kubectl cluster-info hangs for about a minute before ultimately showing the following error:

Error from server (InternalError): an error on the server ("") has prevented the request from succeeding

My Solution

It’s super simple. Check the output of kubectl config view and make sure that your authentication information is accurate. In my case the user token was wrong, since I had been bringing up and tearing down LKE clusters and had forgotten to change my token. The error could probably be a bit more verbose or otherwise narrow the context down a bit, but alas.
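As a quick sketch, verifying and then fixing the token might look like the following. The user name here is a placeholder for whatever your kubeconfig calls the LKE user, and the token is the one shown in the Linode UI/API:

kubectl config view --minify
kubectl config set-credentials lke-admin --token="TOKEN_FROM_LINODE"

Alternatively, just replace your ~/.kube/config (or the relevant user entry in it) with the kubeconfig file that Linode provides for the cluster.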

The Long Story

Incidentally, I was on Windows 10, running kubectl from PowerShell, but that doesn’t seem to be germane to the situation.

Running kubectl cluster-info --v=10 provided a ton of information. Note that --v is perhaps underdocumented (or was at one point).

What I found was that I was getting numerous messages like Got a Retry-After 1s response for attempt 8 to https://my-cluster:443/api?timeout=32s until the whole request timed out. I checked my Linode control panel and the cluster was indeed up and running.

The whole thing smelled like some kind of auth issue to me, so I double-checked the kubectl config file that Linode offers in the UI (and via API), and noticed that the tokens didn’t match what I had in my .kube/config file. It was then that I remembered I had been tearing down and re-creating k8s clusters via Terraform and had forgotten to update my config file with the proper user token. Oh, the joys of late-night hacking.

Once I updated my config file, I was able to access the cluster.

Solving Terraform: “No valid credential sources found for AWS Provider”

My Problem

Using Terraform v0.12 and attempting to use the AWS provider to init an S3 backend, I’m receiving the error:

Initializing the backend…

Error: No valid credential sources found for AWS Provider.
Please see https://terraform.io/docs/providers/aws/index.html for more information on providing credentials for the AWS Provider

I’m experimenting with providing static credentials in a .tf file (P.S. don’t do this in production) and I’ve verified that the AWS keys are correct.

My Solution

Preamble: The following is terrible, don’t do this. I’m writing this merely as an answer to something that was puzzling me.

Add access_key and secret_key to the Terraform backend block. E.g.:

terraform {
  backend "s3" {
    bucket = "your-bucket"
    region = "your-region"
    key = "yourkey/terraform.tfstate"
    dynamodb_table = "your-lock-table"
    encrypt = true
    access_key = "DONT_PUT_KEYS_IN_YOUR.TF_FILES"
    secret_key = "NO_REALLY_DONT"
  }
}

This would be in addition to the keys that you’ve placed in your provider block:

provider "aws" {
   region = "us-east-1"
   access_key = "DONT_PUT_KEYS_IN_YOUR.TF_FILES"
   secret_key = "NO_REALLY_DONT"
 }

The backend needs to be initialized before the provider plugin, so any keys in the provider block are not evaluated. The Terraform backend block needs to be provided with its own keys.

A better method would be to use environment variables, among other more secure options (including the use of a shared_credentials_file and a profile, such as what Martin Hall references in the comments below). You can also provide a partial configuration and then pass the values in via the command line.
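As a sketch, the environment variable route looks like the following (the values are placeholders). Both the S3 backend and the AWS provider will pick these up, so no credentials need to live in any .tf file:

export AWS_ACCESS_KEY_ID="YOUR_KEY_ID"
export AWS_SECRET_ACCESS_KEY="YOUR_SECRET_KEY"
terraform init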

The Long Story

There are a number of ways to provide Terraform with AWS credentials. The worst option is to use static credentials provided in your .tf files, so naturally that’s what I’m experimenting with.

One way to provide credentials is through environment variables, and when I tested that method out, it worked! I’ll make use of environment variables in the future (promise), but I want to figure out why static credentials aren’t working because… because.

Another way to provide AWS credentials is via the good ol’ shared credentials file located at ~/.aws/credentials. Again, this works in my scenario, but I’m stumped as to why static credentials won’t.

(Side note: At this point in the story, this is the universe telling me just how bad it is to use static credentials, but my preferred decision making methodology is to ignore such urgings.)

Let’s debug this sucker by setting the environment variable TF_LOG to trace: export TF_LOG=trace

# terraform init
2020/05/21 06:26:58 [INFO] Terraform version: 0.12.25
2020/05/21 06:26:58 [INFO] Go runtime version: go1.12.13
2020/05/21 06:26:58 [INFO] CLI args: []string{"/usr/bin/terraform", "init"}
2020/05/21 06:26:58 [DEBUG] Attempting to open CLI config file: /root/.terraformrc
2020/05/21 06:26:58 [DEBUG] File doesn't exist, but doesn't need to. Ignoring.
2020/05/21 06:26:58 [INFO] CLI command args: []string{"init"}

Initializing the backend…

2020/05/21 06:26:58 [TRACE] Meta.Backend: built configuration for "s3" backend with hash value 953412181
2020/05/21 06:26:58 [TRACE] Preserving existing state lineage "da125f8e-6c56-d65a-c30b-77978250065c"
2020/05/21 06:26:58 [TRACE] Preserving existing state lineage "da125f8e-6c56-d65a-c30b-77978250065c"
2020/05/21 06:26:58 [TRACE] Meta.Backend: working directory was previously initialized for "s3" backend
2020/05/21 06:26:58 [TRACE] Meta.Backend: using already-initialized, unchanged "s3" backend configuration
2020/05/21 06:26:58 [INFO] Setting AWS metadata API timeout to 100ms
2020/05/21 06:27:00 [INFO] Ignoring AWS metadata API endpoint at default location as it doesn't return any instance-id
2020/05/21 06:27:00 [INFO] Attempting to use session-derived credentials

Error: No valid credential sources found for AWS Provider.
Please see https://terraform.io/docs/providers/aws/index.html for more information on providing credentials for the AWS Provider

Huh, it’s as if the backend section is totally ignoring my provider credentials.

It was then that I realized that the backend block has its own variables for keys. Well that’s weird. Why would it need its own definition of my provider’s keys when I already have keys placed in the “aws” provider block? Unless… Terraform doesn’t look at that block.

Some further research confirms that when a Terraform backend is init’d, it’s executed before just about anything else (naturally), and there’s no sharing of credentials from a provider block, even when the backend and a provider target the same service (e.g. a backend that uses Amazon S3 will not look to the AWS provider block for credentials).

Once I placed my AWS keys in the Terraform backend block (don’t do that), things worked.
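The less bad variant of what I did above is a partial backend configuration: leave the credentials out of the backend block entirely and hand them to terraform init. The values below are placeholders, and note that command-line arguments end up in your shell history, so environment variables are still the better option:

terraform {
  backend "s3" {
    bucket         = "your-bucket"
    region         = "your-region"
    key            = "yourkey/terraform.tfstate"
    dynamodb_table = "your-lock-table"
    encrypt        = true
  }
}

terraform init \
  -backend-config="access_key=YOUR_KEY_ID" \
  -backend-config="secret_key=YOUR_SECRET_KEY"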

Adding Simple base64 Decoding to Your Shell

I had a need to repeatedly decode some base64 strings quickly and easily. Easier than typing out openssl base64 -d -in <infile> -out <outfile>, or even base64 --decode file.

The simplest solution that I found and prefer is a shell function with a here string. Crack open your preferred shell’s profile file. In my case, .zshrc. Make a shell function thusly:

decode() {
  base64 --decode <<<$1
}

Depending on your shell and any addons, you may need to echo an extra newline to make the decoded text appear on its own line and not have the next shell prompt append to the decoded text.
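If that happens, a variant like this (the same function with a trailing echo) keeps your prompt off of the decoded text:

decode() {
  base64 --decode <<<"$1"
  echo
}

Usage looks like:

$ decode aGVsbG8=
hello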

Solved: Getting Backblaze to Backup OneDrive Folders in Windows

My Problem

I use Microsoft Office 365 and OneDrive for my consulting work to keep my files synced between multiple devices and preserved from loss should I have my laptop stolen or otherwise destroyed. I use Backblaze as part of my strategy to back up the data and keep version history of my files. This can be a tiny bit tricky since Backblaze can’t back up the files if you have OneDrive “Files On-Demand” turned on. However, once you turn Files On-Demand off, Backblaze should be able to back up the files just like any other file on your hard drive. In theory.

In practice, I was unable to get one particular folder contained within OneDrive to back up to Backblaze. This was a considerable problem because that one particular folder was the main folder that I kept all of my business files in. It was essentially the only folder that I cared deeply about having backed up, and as luck would have it, it was the only folder that wasn’t showing up in my list of files that I could restore from Backblaze.

After considerable work with Backblaze support, we came to the final solution.

My Solution

Reparse points! Check to see if the directory that isn’t being backed up has the ReparsePoint attribute. There are a few ways to do this, but the plainest one that I used was:

> gci|select name,attributes -u

Name                                       Attributes
----                                       ----------
Important Work       Directory, Archive, ReparsePoint
GoProjects                                  Directory
More Work            Directory, Archive, ReparsePoint
Even more work       Directory, Archive, ReparsePoint

As it turns out, OneDrive apparently has a history of changing if and when it marks a directory with the ReparsePoint attribute. Here’s where I have to insert a giant disclaimer:

I don’t know if changing the ReparsePoint attribute manually out from under OneDrive will do anything nasty and prevent OneDrive from working as intended. I also do not know if OneDrive will silently add the ReparsePoint attribute to folders again, thus causing Backblaze backups to silently fail. I’ll be checking this over time, but you should check it for yourself as well.

However, note that changing a directory’s ReparsePoint attribute in this situation will not delete data.

As it turns out, most if not all of my directories under the one crucial directory were marked with the ReparsePoint attribute. My only choice was to recursively check each directory and remove the attribute. If you take such a scorched earth approach, this will very likely tamper with any junctions and/or mount points that you have in that tree of your filesystem, so beware of what that implies for your usage. For me, there were no known negative implications.

My solution was to mass change the troublesome directory with some PowerShell:

Get-ChildItem -recurse -force | Where-Object { $_.Attributes -match "ReparsePoint" } | foreach-object -Process {fsutil reparsepoint delete $_.fullname}

For more information, check out the help document for the fsutil tool. Keep in mind that while the verb delete is scary, it doesn’t actually delete any files or directories; rather, it simply removes the ReparsePoint attribute from the filesystem object.

After that, I forced a rescan of the files that Backblaze should back up (Backblaze has instructions for this for both Windows and Mac). Suddenly thousands of new files were discovered and began uploading. After a little while, I checked Backblaze.com for what files I could restore, and sure enough, the troublesome folder and seemingly all of its child items were in my available backup.

I’ll periodically check back on my filesystem to see if any directories were re-marked with ReparsePoint and make note of it here. If I was smart and diligent, I’d make a scheduled task to remove that attribute from the areas of my filesystem that I’m concerned with.

Workaround: “Unable to Change Virtual Machine Power State: Cannot Find a Valid Peer Process to Connect to”

My Problem

Attempting to start a virtual machine in VMware Workstation 15 Pro (15.0.3) on a RedHat based Linux workstation caused the following error: “Unable to Change Virtual Machine Power State: Cannot Find a Valid Peer Process to Connect to”

I was able to start other virtual machines in the VM library, however.

My Workaround

Note that this is simply a workaround. I don’t yet know the ultimate cause, but I’m documenting how I work around the problem until I or someone else can figure it out.

First, check to see if the virtual machine is actually running, in spite of there being no visual indicators within VMware Workstation: vmrun list

You’ll probably see that the virtual machine is running. If you don’t, then this workaround isn’t likely to help you. Attempt to shut the running virtual machine down softly: vmrun stop /path/to/virtual_machine.vmx soft

After that, you should be able to start the machine again, until the next time it crashes for unknown reasons. More news as I discover it.

Dumping Grounds (Turn Back Now):

I’ll dump some of my notes here and they’ll be updated periodically as I find out more info about this issue. You’re completely safe to ignore everything past this point. Abandon all hope, ye who proceed.

I had recently upgraded from Fedora 29 to Fedora 30, and was experiencing some minor instability with my main workstation. I’m not sure if that was the ultimate cause of this issue, but I’m suspicious since I never had this issue until after the upgrade.

My first act was to go to the Help menu, select the “Support” submenu, and then “Collect Support Data…” I chose to collect data for the specific VM that was having this issue. This took quite a while by my standards: about 20 minutes. It basically creates a giant zipped dump of files from across your physical machine that pertain to VMware and to that specific virtual machine. It’s not super easy to parse or to know what to look for.

I searched through /var/log/vmware/ for clues in any of the log files found therein. Grepping for the pertinent virtual machine’s name across all of the files, and looking at the surrounding context, didn’t turn anything up.

I attempted to start the vmware-workstation-server service but that failed. I don’t think that’s the issue since the virtual machine isn’t a shared VM.

I tried vmrun list and saw that the Windows VM was actually listed as running. I stopped it softly, vmrun stop /path/to/my/virtual_machine.vmx soft, and was then able to start the virtual machine. I’m not sure what causes the VM to crash, what causes VMware Workstation Pro itself to crash, or why, when I start Workstation back up, it doesn’t appear to know that the VM it was previously working with is actually still running.

Solved: “bad input file size” When Attempting to `setfont` to a New Console Font

My Problem

In a Linux distribution of one kind or another, when attempting to set a new console font in a TTY, you may receive the following error:

# setfont -32 ter-u32n.bdf
bad input file size

My Solution

First, if you’re coming to this blog post because you’re attempting to install larger Terminus fonts for your TTY, you probably just want to search your distribution’s package manager for Terminus, specifically the console fonts package:

$ yum search terminus
== Name Matched: terminus ==
terminus-fonts.noarch : Clean fixed width font
terminus-fonts-grub2.noarch : Clean fixed width font (grub2 version)
terminus-fonts-console.noarch : Clean fixed width font (console version)
$ yum install terminus-fonts-console

However, if you’re coming to this blog post for other reasons, then you’re probably attempting to setfont with a .bdf file, or just something that isn’t a .psf file. You most likely need to follow the instructions for your font, in my case Terminus, to convert the files into the proper .psf format. The Linux From Scratch project has a good quick primer on the topic that you can use to mine for search terms and further information.

With my specific font, what worked for me was:

$ sudo ./configure --psfdir=/usr/lib/kbd/consolefonts
$ sudo make -j8 psf
# Stuff happens here
$ sudo make install-psf

After that, I had the fonts installed into my /usr/lib/kbd/consolefonts directory and was able to setfont and further change my TTY font to my preferences.
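For reference, once the .psf files are in the consolefonts directory, switching a TTY to one of them looks like this (ter-u32n is the font from my example above, and the vconsole.conf step is Fedora/RHEL-specific):

$ setfont ter-u32n
# to persist the choice across reboots, set FONT=ter-u32n in /etc/vconsole.conf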

Solved: Attempting to Install and Configure Wireguard Fails with “Unknown device type” and “FATAL: Module wireguard not found in directory”

My Problem

Attempting to install and use Wireguard (version 0.0.20190406-1) on Fedora release 29 is unsuccessful with a variety of symptoms. The first being:

ip link add dev wg0 type wireguard
Error: Unknown device type.

Attempting to get some info about the module with modprobe shows:

$ modprobe wireguard
modprobe: FATAL: Module wireguard not found in directory /lib/modules/5.0.4-2004

The dkms tool shows that the wireguard module is added:

$ dkms status
wireguard, 0.0.20190406: added

However, attempting to build it shows:

$ dkms build wireguard/0.0.20190406
Error! echo
Your kernel headers for kernel 5.0.4-200.fc29.x86_64 cannot be found at /lib/modules/5.0.4-200.fc29.x86_64/build or /lib/modules/5.0.4-200.fc29.x86_64/.

My Solution

Make sure that your running kernel and your kernel headers are the same version, or at least that the running version of the kernel is newer than your kernel headers.

For example, I’m running on a RedHat based system, and checked the following:

$ uname --kernel-release
5.0.4-200.fc29.x86_64

But then the kernel headers were newer:

$ rpm -q kernel-headers
kernel-headers-5.0.9-200.fc29.x86_64

My solution was to yum update the kernel and reboot. I didn’t have to re-install the headers or the wireguard packages. Another possible solution would have been to manually install 5.0.4 kernel headers, but that would require uninstalling packages that marked 5.0.9 kernel headers as a dependency. I believe the cleaner solution is to simply update the kernel.
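Concretely, the fix amounted to the following; the dkms status at the end is just there to confirm that the module ended up installed, since I didn’t have to rebuild anything by hand:

$ sudo yum update kernel
$ sudo reboot
# after rebooting into the new kernel
$ uname --kernel-release
$ rpm -q kernel-headers
$ dkms status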

The Long Story

First, I checked that I even had kernel headers installed in the first place:

$ rpm -q kernel-headers
kernel-headers-5.0.9-200.fc29.x86_64

Well that’s interesting, because:

$ uname --kernel-release
5.0.4-200.fc29.x86_64

So I’m running kernel 5.0.4, but the kernel-headers package that I’m offered is for 5.0.9. I attempted to install the specific kernel header package by version:

yum install kernel-headers-5.0.4-200.fc29.x86_64
[...]
No match for argument: kernel-headers-5.0.4-200.fc29.x86_64

At this point, I had two viable options.

  1. I could update the running kernel, since 5.0.10-200.fc29 was released and waiting for me.
  2. I could go into Fedora’s build system, Koji, and pull out the specific kernel headers package that I needed to then install manually.

Choosing #2, however, would require me to uninstall the current 5.0.9 kernel headers, and anything that had it as a dependency. This includes things like binutils and gcc, among many others. I decided to update the system. A quick yum update and reboot later, and:

$ uname -or
5.0.10-200.fc29.x86_64 GNU/Linux

My only concern was that the headers that are in the official yum repo are 5.0.9; a minor version behind the new kernel:

$ rpm -q kernel-headers
kernel-headers-5.0.9-200.fc29.x86_64

Nevertheless, my fears were allayed with dkms:

$ dkms status
wireguard, 0.0.20190406, 5.0.10-200.fc29.x86_64, x86_64: installed

Previously, wireguard had only been added, but not successfully installed. I quickly tried to add a wireguard interface:

$ ip link add dev wg0 type wireguard
$ ip link show wg0
3: wg0: <POINTOPOINT,NOARP> mtu 1420 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/none

Success!

Back to the Startup and Freelance Scene

2014 was a watershed year for me. I had been a full-time freelancer since 2010, learning a wide variety of technologies as well as the intricacies of developing a consulting business.

However, I had an opportunity to get into an exciting Y Combinator startup that I couldn’t pass up. In May of 2014, I joined MongoHQ, a database-as-a-service company focusing on hosted MongoDB deployments. We later rebranded to Compose when we started hosting more than MongoDB. First with Elasticsearch, and then PostgreSQL, Redis, and more.

In 2015 we were acquired by IBM, and it was an amazing ride through the acquisition and integration process. We added more people, obtained new and loftier goals, and had a great time succeeding in a totally new environment.

After three years of working on the Compose product within IBM, I couldn’t ignore the pull back to the startup and freelance scene. As of Oct 15th, 2018, I’m back to the wilds of uncertainty and terror… er, excitement. I’m taking up residence at a coworking space / startup incubator called Galvanize, specifically at their Phoenix campus.

I’m back to freelancing and startup work, and perhaps hunting a unicorn. Or at least a very pretty horse. If anyone is currently on the same path, I’d love to talk with you and see how things are going in 2018 and on into 2019. If anyone is in need of a consultant / contractor / freelancer who can poke a bit at AWS, Azure, MongoDB, Elasticsearch, Redis, PostgreSQL, JavaScript, Ruby, Go, Linux, and a litany of other trendy and cloudy technologies, reach out: me@wesley.sh

If anyone is specifically in the Phoenix, Arizona startup scene, stop by Galvanize and let’s grab some lunch and talk about sunburns and hiking. =)

Solved: WordPress – “An unexpected error occurred.” when installing plugins, themes, and more.

My Problem

Attempting to add things to WordPress like plugins or themes causes the following error:

An unexpected error occurred. Something may be wrong with WordPress.org or this server’s configuration. If you continue to have problems, please try the support forums.

My Solution

Check your SELinux audit logs for signs of denials. Your web server software (probably Apache / httpd) or a module being used by the software is most likely having outbound connection attempts denied.
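As a sketch, the check and the usual fix look like this on a Red Hat-style system. The boolean below is the common culprit when httpd is blocked from making outbound connections, but confirm it against the actual denial in your audit log before flipping it:

$ sudo ausearch -m avc -ts recent
$ sudo setsebool -P httpd_can_network_connect 1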
