Debugging WordPress Media imports

Importing files and posts into a complicated WordPress site can be a bit of a black art: there are lots of places where things might go wrong. One of them is upload_url_path. If that value has been changed manually for your site, it can cause problems when you import media from WP export files.

Q: why would you change upload_url_path?

A: changing upload_url_path lets WordPress serve images etc. from a subdomain or different domain, which should improve site performance if you have an image-heavy site.

This is an example item entry from the WordPress eXtended RSS (WXR) export file. It is a simplified entry, with most of the post data (comment status, stickiness, time, post ID etc.) removed for clarity.

<item>
	<title><![CDATA[Triplets]]></title>
	<link>https://iso200.com/photo-blog/signs/neon-triplets/attachment/dsc_6318/</link>
	<pubDate>Fri, 15 May 2009 20:20:12 +0000</pubDate>
	<dc:creator><![CDATA[dlf]]></dc:creator>
	<guid isPermaLink="false">http://iso200.com/simple/wp-content/uploads/2009/05/dsc_63181.jpg</guid>
	<description></description>
	<content:encoded><![CDATA[neon, sign, Rue de la Paix, Paris, France]]></content:encoded>
	<excerpt:encoded><![CDATA[Rue de la Paix, Paris, France]]></excerpt:encoded>
	<wp:post_type><![CDATA[attachment]]></wp:post_type>
	<wp:attachment_url><![CDATA[//content.iso200.com/2009/05/dsc_63181.jpg]]></wp:attachment_url>
	<wp:postmeta>
	<wp:meta_key><![CDATA[_wp_attached_file]]></wp:meta_key>
	<wp:meta_value><![CDATA[2009/05/dsc_63181.jpg]]></wp:meta_value>
	</wp:postmeta>
</item>

The problem is with this line:

<![CDATA[//content.iso200.com/2009/05/dsc_63181.jpg

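The attachment URL doesn’t contain a protocol (either http or https), which is why my media import failed. A protocol isn’t needed in the upload_url_path value, because a browser resolves a protocol-relative URL (one starting with //) against the protocol of the current page. It is needed when WP imports media, though, because the importer has to fetch each file from a complete URL.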
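To check an export file for this problem before importing, a short script can list the attachment URLs that have no protocol. This is a minimal Python sketch rather than anything WordPress provides: the function name urls_missing_protocol is made up for illustration, and it assumes the URLs are wrapped in CDATA exactly as in the entry above.

```python
import re
from urllib.parse import urlsplit

# Captures the CDATA payload of each wp:attachment_url element.
# Assumption: the export wraps the URL in CDATA, as in the sample entry.
ATTACHMENT_URL = re.compile(
    r"<wp:attachment_url><!\[CDATA\[(.*?)\]\]></wp:attachment_url>"
)


def urls_missing_protocol(wxr_text):
    """Return the attachment URLs that have no scheme (http, https, ...)."""
    urls = ATTACHMENT_URL.findall(wxr_text)
    return [url for url in urls if not urlsplit(url).scheme]


sample = """
<wp:attachment_url><![CDATA[//content.iso200.com/2009/05/dsc_63181.jpg]]></wp:attachment_url>
<wp:attachment_url><![CDATA[https://content.iso200.com/2009/05/dsc_63181.jpg]]></wp:attachment_url>
"""

# Only the first, protocol-relative URL is reported.
print(urls_missing_protocol(sample))
```

An empty list means every attachment URL the script found already carries a protocol.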

Solutions:

1. Post-export: search the export file for the wp:attachment_url lines (with grep, or your editor’s find and replace) and add a protocol to each URL, i.e. http: or https:. Don’t forget the : after the protocol. The wp:attachment_url line should now look like this:

<![CDATA[https://content.iso200.com/2009/05/dsc_63181.jpg

2. Pre-export: check your Media Settings page and make sure the upload URL includes a protocol.

NB: you only see these options in your Media Settings page if you have changed the value of upload_url_path.
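For a large export file, the post-export fix in solution 1 can be scripted instead of edited by hand. The following is a rough Python sketch under a few assumptions: the names add_protocol and fix_export are hypothetical, https is only a default, and the pattern expects the CDATA form shown in the entry above. Keep a copy of the original export file before trying anything like this.

```python
import re
from pathlib import Path

# A protocol-relative URL right after the CDATA opener of wp:attachment_url.
PROTOCOL_RELATIVE = re.compile(r"(<wp:attachment_url><!\[CDATA\[)//")


def add_protocol(wxr_text, protocol="https"):
    """Prefix protocol-relative attachment URLs with '<protocol>://'."""
    return PROTOCOL_RELATIVE.sub(
        lambda match: f"{match.group(1)}{protocol}://", wxr_text
    )


def fix_export(src, dst, protocol="https"):
    """Read the export file at src and write a corrected copy to dst."""
    text = Path(src).read_text(encoding="utf-8")
    Path(dst).write_text(add_protocol(text, protocol), encoding="utf-8")


line = (
    "<wp:attachment_url><![CDATA[//content.iso200.com/2009/05/dsc_63181.jpg]]>"
    "</wp:attachment_url>"
)
print(add_protocol(line))
```

URLs that already start with http: or https: are left untouched, so running the script twice on the same file is harmless.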

Removing auto-generated WordPress thumbnail images

If you have folders full of intermediate thumbnail sizes that you don’t want, the right tools make it easy to select and delete all the thumbnails WordPress has generated. This is particularly useful if you change thumbnail sizes and want to get rid of the old images.

If you have a Mac, use ‘Find Any File’ with a regex pattern to search for the images: -(\d{2,4})x(\d{2,4})\.(jpg|jpeg|png|gif)

The regex selects file names that end with a dash, then the thumbnail width, an x, the thumbnail height and one of the usual image file extensions (such as -150x150.jpg). This is the standard WP thumbnail naming format.

‘Find Any File’ will give you a window of images and you can manage/delete as you want.

If you type an incorrect regex (search) string in, the app will helpfully flag the error.
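If you’re not on a Mac, the same thumbnail pattern can be used from a script. This is a minimal Python sketch, not a finished tool: the names is_thumbnail and find_thumbnails are hypothetical, and a $ has been added to the pattern so that it only matches at the very end of the file name. It lists candidates and deliberately does not delete anything.

```python
import re
from pathlib import Path

# The thumbnail pattern from above, anchored to the end of the file name.
THUMBNAIL = re.compile(r"-(\d{2,4})x(\d{2,4})\.(jpg|jpeg|png|gif)$")


def is_thumbnail(filename):
    """True if the name ends in -<width>x<height>.<image extension>."""
    return THUMBNAIL.search(filename) is not None


def find_thumbnails(uploads_dir):
    """Yield files below uploads_dir whose names look like thumbnails."""
    for path in Path(uploads_dir).rglob("*"):
        if path.is_file() and is_thumbnail(path.name):
            yield path


# Illustrative names only: one original and two generated sizes.
for name in ["dsc_63181.jpg", "dsc_63181-150x150.jpg", "dsc_63181-1024x768.jpg"]:
    print(name, is_thumbnail(name))
```

An original image that happens to be named in the same style (a hypothetical banner-800x600.jpg, say) would match too, so review the list before deleting anything.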

If you have repeatedly imported a test file into a development server, this may leave you with a number of additional duplicate full-size images in your uploads folder – the regex above will not remove these though.

The following string has found a lot of duplicates for me, but not all of them. Be aware that it may also select images that you don’t want to delete, depending on their naming format: (\d)-(\d)\.(jpg|jpeg|png|gif)
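Trying the duplicate pattern on a few file names shows both sides of that warning. The names below are invented for illustration and are not taken from a real uploads folder; the sketch only demonstrates what the regex does and does not match.

```python
import re

# The duplicate pattern from above: digit, dash, digit, dot, image extension.
DUPLICATE = re.compile(r"(\d)-(\d)\.(jpg|jpeg|png|gif)")

names = [
    "dsc_6318-1.jpg",    # matches: digit, dash, single digit
    "dsc_6318-12.jpg",   # missed: two digits after the dash
    "neon-sign-1.jpg",   # missed: a letter, not a digit, before the dash
    "paris-2009-5.jpg",  # matches, although it could be an original file
]

matches = {name: bool(DUPLICATE.search(name)) for name in names}
for name, matched in matches.items():
    print(name, matched)
```

The two misses suggest why the string doesn’t find every duplicate, and the last name is the kind of file it may select by mistake.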

TIP:

When you import a test file of WP posts (etc.) into a development server, create a new uploads folder for each test import. This avoids ending up with duplicate original files.