Occasionally, a post that I write is deemed interesting enough by another blogger to be quoted. This generates an incoming link that appears among the article’s comments. I don’t mind this at all. In fact, I think it’s good when bloggers read one another’s posts and take up the discussion. So if you are an honest blogger, you may quote me on that and on anything else!
However (and there’s always a however in this wicked world), there are those who quote my blog – and yours, and anyone else’s – for less innocent reasons. This practice has become so prevalent as to rival spamming as a nuisance and has attracted the ire of the blogging community. As a result, it has been given a name, blog scraping, and the blogs based on it are referred to by the term splogs.
As far as I can make out, there are two main sorts of splogs. They all have one thing in common (with the odd exception), namely that their authors (if they deserve such a name) use automatic software to cruise the blogosphere and seize on certain articles according to the keywords in them.
Let me give you an example. A few weeks back, I wrote a couple of posts about my hearing aids. These were immediately “scraped” by a site claiming to aggregate information on hearing aids. However, the word “hearing” also appeared in a later post of mine, this time without any reference to hearing aids, and that post, too, was scraped by the same splog. This lack of finesse shows that automatic software is being used.
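To illustrate the point, here is a minimal sketch in Python of the kind of crude keyword matching that would produce this behaviour. It is purely hypothetical: I have no idea what software the sploggers actually use, and the post titles, the texts and the function name `looks_relevant` are all invented for the example.

```python
# Hypothetical sketch of naive keyword-based scraping.
# Not taken from any real splog software; all names and data are made up.

KEYWORDS = {"hearing"}  # the scraper's assumed topic keyword


def looks_relevant(text, keywords=KEYWORDS):
    """Return True if any keyword appears as a whole word in the text."""
    # Split on whitespace, strip common punctuation, and lower-case each word.
    words = {w.strip(".,;:!?\"'()").lower() for w in text.split()}
    return not keywords.isdisjoint(words)


posts = {
    "My new hearing aids": "I collected my hearing aids from the clinic today.",
    "A day in court": "The hearing was adjourned until next week.",
    "A walk in the park": "We fed the ducks and went home.",
}

# The matcher only sees the word, not its meaning, so the court post is
# picked up alongside the genuine hearing-aid post.
scraped = [title for title, body in posts.items() if looks_relevant(body)]
print(scraped)
```

A matcher like this has no notion of context, which is why a post that merely contains the word “hearing” would be swept up along with the ones that are really about hearing aids.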
The sorts of splog that I know of (though there may be others) are, firstly, those that quote the whole of a post, often without acknowledging the source (i.e. you or your blog) or even attributing it to a false author, and, secondly, those that quote a few lines of an article, following these with a “Read more…” link to the original post. What these splogs tend to have in common, apart from a rather dull format, is a complete lack of any contact information, making it difficult to express your displeasure at their actions.
How do you know you have been “scraped”? An essential part of the scraping process is to generate an incoming link to your blog. You will see this either in the comments section at the end of a post or among the comments held for review as possible spam by the blogging software. The idea behind this, I think, is to generate traffic to the splog, which will raise its ranking on search engines such as Google.
Why would these parasites wish their sad, unoriginal and lack-lustre little splogs to achieve such favour? There might be several motives, I suppose. A splog with a high profile might attract advertising, for example, and thus generate income.
So does scraping really matter? After all, doesn’t it possibly give your blog extra publicity? There are several answers to that, depending on the different sorts of blogs that are scraped. In the first place, most honest people deprecate with various degrees of passion such parasitical use of their original work. If you are an artist or writer or a provider of information, you do not want your work to be stolen and credited to someone else. At the very least, scraping is a breach of copyright. Added to this, you, the author, have no control over where your work is exhibited and may find your posts appearing where you do not wish them to appear, for example on porn sites.
Are there any defences against scraping? None that I can see. A lot of bloggers now put copyright or Creative Commons licence announcements on their blogs. If you use WordPress, there is a plug-in * designed to insert a copyright notice in your text if this is quoted in its entirety. I don’t think these help much because machines don’t take any notice of copyright notices and scrapers probably pay no heed to them either.
Are there, then, any remedies once the offence has occurred? If you can find a contact address, you can try asking the splogger to remove your content. Some bloggers report success in this. If you cannot contact the splogger or if the latter does not respond, you can contact the Internet service provider who is hosting the splog. This too has met with some success.
This is all very reminiscent of the fight against spammers. It costs time and possibly money to go along this route and unless you are a business blogger and you feel your business is being jeopardized, it may not be worth the trouble.
In any case, such victories, if they are won, are piecemeal. For every splog that you persuade to delete your post or that is closed down, several more will appear to continue their parasitical activities. As with spammers, we are on a hiding to nothing in the absence of a global strategy to deal with the problem.
There are no copyright notices on my blog. This is not because I do not value my work but because I believe that the blogging community in general is honest. I do not want to deter others from quoting me or linking to me for perfectly legitimate reasons, and those who rip off my content will ignore copyright notices anyway. Ending every post with “© 2008 SilverTiger” would, I feel, be giving in to hysteria. You may disagree with me and, if so, good luck to you. Let me know whether it makes any difference and I will perhaps change my mind.