Protect Your Most Valuable Blog Resource, Stop Content Scraping and Plagiarism

September 17, 2010

There’s a very popular saying amongst bloggers, and it goes: content is king. As a blogger, your content is your most precious resource. I don’t know about you, but I’m not going to let sploggers and feed scrapers take that away from me. Not if I can help it. Not if you can help it. How?

Label your feeds with copyright notices.

Add your name, your website's name, and a URL (for the site or for the post) to your feed so that anyone who reads it elsewhere will know where it really came from.

Recommendation: FeedEntryHeader Plugin. Many feed customization plugins exist, but I like this particular plugin because it affixes the necessary information before the content of the post rather than after, and feed scrapers usually truncate the content. And if you can help it, spell out the URL of your website or blog post in plain text rather than linking to it with HTML. Scrapers want visitors to think the content is their own, so links back to the source are the first thing they will want gone; a URL written out in the text is harder to strip.
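To make the idea concrete, here is a minimal sketch in Python of putting a plain-text notice in front of a feed item's content. This is not how the FeedEntryHeader plugin is implemented; the `FeedItem` class, the `add_attribution` function, and the wording of the notice are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class FeedItem:
    """Hypothetical container for one post as it appears in a feed."""
    title: str
    url: str
    content_html: str


def add_attribution(item, author, site_name, year):
    """Return the item's HTML with a copyright notice placed BEFORE the content.

    The URL is written out as text, not wrapped in an <a> tag, so it still
    shows up if a scraper strips links or truncates the end of the post.
    """
    notice = (
        f"<p>Copyright {year} {escape(author)}. "
        f"Originally published at {escape(site_name)}: {escape(item.url)}</p>"
    )
    return notice + "\n" + item.content_html


# Example with placeholder values.
post = FeedItem("Hello", "http://example.com/hello", "<p>Post body.</p>")
print(add_attribution(post, "Jane Blogger", "Example Blog", 2010))
```

Because the notice comes first, a scraper that only republishes the opening of each post still republishes the attribution.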

Feedback: Do you use summaries instead of full feeds because you don’t want scrapers to access them? Or do you provide both?

Block questionable visitors.

If they can’t find your blog, they won’t be able to take advantage of it.

Recommendation: AntiLeech Plugin. Ideally, this plugin stops potential scrapers from accessing your website content and feeds them fake content instead. You can enter either IP addresses or User Agent strings that identify the scrapers. Read more about AntiLeech here.

The tricky part is figuring out who your enemy is. They will have to scrape your feed first for you to know about it, right? You can use ©Feed to figure out who is reading your feeds, but more often than not they actually send trackbacks to your post once they’ve scraped it, so you can get their IP address from that as well.
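The matching logic behind this kind of blocking can be sketched in a few lines of Python. This is only an illustration of the general approach (check the visitor's IP address and User Agent against a blacklist, then serve decoy content on a match), not AntiLeech's actual code. The network range, the User Agent substring, and the function names are placeholders I made up.

```python
from ipaddress import ip_address, ip_network

# Placeholder blacklist entries; replace with addresses and agents you have
# actually seen scraping (for example, taken from trackbacks).
BLOCKED_NETWORKS = [ip_network("203.0.113.0/24")]
BLOCKED_AGENT_SUBSTRINGS = ["examplescraper"]


def is_blocked(remote_ip, user_agent):
    """Return True if the visitor matches the IP or User Agent blacklist."""
    addr = ip_address(remote_ip)
    if any(addr in net for net in BLOCKED_NETWORKS):
        return True
    agent = (user_agent or "").lower()
    return any(fragment in agent for fragment in BLOCKED_AGENT_SUBSTRINGS)


def choose_feed_body(remote_ip, user_agent, real_feed, decoy_feed):
    """Serve the decoy feed to blacklisted visitors, the real one to everyone else."""
    return decoy_feed if is_blocked(remote_ip, user_agent) else real_feed
```

Keep in mind that both signals are easy for a determined scraper to change, so a blacklist like this needs regular updating.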

Feedback: Where do you find your IP address blacklists?

Disable hotlinking.

Hotlinking means that other sites embed files straight from your server, so their visitors use up your bandwidth, which is how much data your server transfers over a period of time. Every file that loads when someone views your website counts toward that bandwidth. So if people keep hotlinking your photos, music, or videos, your bandwidth quota for the month (or quarter or year) gets used up. Hotlinking may not be an issue for you if you have lots of bandwidth and don’t care about attribution or who uses your content. Normally it is an issue, and it’s bad netiquette besides. If you do care, you need to stop people from hotlinking.

Recommendation: Hotlink Protection Plugin. Enter the file location that you want to protect, and if an external website loads any image from it, a different (customizable) image is displayed instead. Since images are the most common target anyway, this plugin will suffice.
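Hotlink protection commonly works by looking at the Referer header that browsers send along with an image request. The Python sketch below shows that check under a few assumptions: the host names, the placeholder image path, and the function names are invented for illustration, and this is not the plugin's own code.

```python
from urllib.parse import urlparse

# Hosts allowed to embed your images; placeholder values.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def is_hotlink(referer):
    """Return True if the request appears to come from a page on another site.

    An empty Referer is allowed here, since direct visits and some privacy
    tools send none; that is a design choice, not a requirement.
    """
    if not referer:
        return False
    host = urlparse(referer).hostname
    return host not in ALLOWED_HOSTS


def image_to_serve(requested_path, referer,
                   placeholder_path="/images/hotlink-notice.png"):
    """Return the replacement image for hotlinked requests, else the real one."""
    return placeholder_path if is_hotlink(referer) else requested_path
```

For example, `image_to_serve("/uploads/photo.jpg", "http://www.example.com/post")` returns the real path, while the same request with a Referer from any other host returns the placeholder.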

Feedback: Do you host your own images or do you hotlink them from sites like PhotoBucket?

*Note: What the plugins accomplish can also be done with less straightforward but more flexible methods such as PHP programming, .htaccess editing, cPanel configuration, or web applications.

Take action.

Protecting your content isn’t just about setting up defense mechanisms. You should be vigilant enough to find out if you’ve been scraped or plagiarized and then do something about it.

Recommendation: 6 Steps to Stop Content Theft. These are six long and tough steps, but if you value your work, you will be thankful when they get you through:

  1. Detection
  2. Preserving the Evidence
  3. Contact the Plagiarist (if Practical)
  4. Contacting the Advertisers (optional)
  5. Contacting the Host
  6. Contacting the Search Engines

Feedback: Do you think Filipino bloggers stand a chance in a battle against plagiarism, with all these (US-biased) steps that need to be accomplished?

Feedback: Do you know that Creative Commons Licenses like the CC Attribution 3.0 License have been ported to play nicely with Philippine copyright laws?

Charge, brothers and sisters!

Right now, fighting plagiarism, especially in the form of sploggers and scrapers, is very tedious. Hopefully things will get easier in the future, but for now, at least we stand a very good chance against it.


8 Comments

  1.   Protect Your Most Valuable Blog Resource, Stop Content Scraping and Plagiarism by The Philippines According to Blogs Said,

    […] There’s a very popular saying amongst bloggers, and it goes: content is king. As a blogger, your content is your most precious resource. I don’t know about you, but I’m not going to let sploggers and feed scrapers take that away from me. Not if I can help it. Not if you can help it. How? Click here. […]

  2. Protect Your Most Valuable Blog Resource, Stop Content Scraping and Plagiarism : PinoyBlogoSphere.com (PBS) Said,

    […] There’s a very popular saying amongst bloggers, and it goes: content is king. As a blogger, your content is your most precious resource. I don’t know about you, but I’m not going to let sploggers and feed scrapers take that away from me. Not if I can help it. Not if you can help it. How? Click here. […]

  3. Karlo.PinoyBlogero Said,

    Thank you very much for this wonderful post! I’ve been having this problem for quite some time now and I am very thankful that you made this article.

    Get ready you content scrapers! I’m going to protect my content! Grrrr.

  4. Links indesejáveis e como retaliar | WordPress-PT Said,

    […] and easy to implement to prevent blog scraping. Anyone who wants to look into the issue further can also read this article, which also covers the problem of plagiarism. However undesirable they may be, and however unpleasant their […]

  5. When Blogs Become Unacknowledged Mainstream Media Sources | WordPress Philippines Said,

    […] Blogs being scraped and plagiarized by other blogs is one thing, but what do you do about the beast that is mainstream media? It’s very disappointing for this type of thing to happen, since television and newspaper companies often consider themselves far more legitimate and reputable than anything that comes from the Internet. […]

  6. Smart SEO Blog » Blog Archive » Wordpress Against SEO Blogs! Said,

    […] that take content from other blogs and re-publish it without permission (this is sometimes called scraping). If a blog contains all or mostly stolen and unoriginal content, it’s gone! • SEO blogs: Blogs […]

  7. Sploggers Beware! « Boehman’s Blogging Bits Said,

    […] interesting site is  Wordpress Philippines.  They have a great post on protecting your blog from […]

  8. poch Said,

    Great article. I’m not really concerned about my blog contents being copied as long as there’s attribution. I would like to suggest Copyscape too as a great tool against plagiarism. Charge!
