Sunday, May 11, 2014

Scribd’s Improved Copyright Protection Systems Reaping Results for Smashwords Authors

Scribd was founded about six years ago as a document upload service.  To their credit, they made the upload of multi-format documents and files ridiculously easy.

Today, their site hosts 50 million documents and books, receives thousands of new uploads every day, and is visited by 80 million monthly readers from over 100 different countries.

The downside of their ease-of-upload, however, is that many users – often enthusiastic readers who don’t understand or respect copyright – attempt to upload unauthorized, copyrighted works, including the ebooks of many Smashwords authors.  It’s the same challenge YouTube faces with unauthorized video uploads.

Following our announcement in December that Smashwords would soon begin supplying ebooks to Scribd, both for retail sales and for inclusion in their subscription service, I heard from several Smashwords authors who were understandably upset to discover unauthorized versions of their books at Scribd.

Some questioned why we would partner with Scribd.  As I’ve shared with authors who have expressed concern, and as I’ll reiterate here, we wouldn’t have partnered with Scribd if we weren't confident their heart was in the right place, and if we weren't confident our relationship with Scribd would benefit all indie authors.

Scribd has 80 million readers, and we’ve got over 300,000 books.  We want to connect those readers' eyeballs and wallets and purses with our books.  We want our authors to receive the full payment they deserve for their hard work.

Scribd wants the same thing. Scribd wants to do right by the indie author community because they know their business is dependent upon earning and deserving the trust and support of authors everywhere.  

With Scribd’s enthusiastic blessing, I orchestrated a conference call in January for Scribd’s top executives with several concerned Smashwords authors. Scribd wanted to hear our authors' concerns, and then after listening Scribd shared their plans to combat the unauthorized uploads.

At the end of the call, Scribd shared how they planned to release a major update to their copyright protection system in the next few weeks.

I’m pleased to report that Scribd is delivering. Scribd has made some impressive strides over the last few months toward eradicating unauthorized content.

Scribd has since renamed their copyright protection technology BookID.  In a nutshell, here's how it works:  BookID automatically scans all Smashwords-delivered books, and analyzes the text for semantic data such as word count, letter frequency, phrases, and other elements. BookID then creates a digital fingerprint of the authorized Smashwords book, and uses this fingerprint to automatically detect and remove unauthorized versions.  It proactively removes all files at Scribd that match the same fingerprint, and also uses this fingerprint to proactively block the upload of future unauthorized versions.
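For readers curious about what text fingerprinting can look like in practice, here is a minimal Python sketch of the general idea. It is not Scribd's BookID implementation, whose internals I don't know. It assumes a simple approach in which a book's fingerprint is the set of hashed five-word phrases it contains, and an upload is flagged when most of its phrases overlap with the authorized edition. The function names, the phrase length, and the 0.8 threshold are all hypothetical choices made for illustration.

```python
import hashlib
import re


def shingles(text, k=5):
    """Build a fingerprint: the set of hashed k-word phrases in the text.

    Hypothetical helper for illustration only, not part of BookID.
    """
    # Normalize case and punctuation so trivial reformatting does not matter.
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return set()
    if len(words) < k:
        phrases = [" ".join(words)]
    else:
        phrases = [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]
    return {hashlib.sha1(p.encode("utf-8")).hexdigest() for p in phrases}


def similarity(a, b):
    """Jaccard similarity: shared phrases divided by all distinct phrases."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def is_probable_copy(reference_text, upload_text, threshold=0.8, k=5):
    """Flag an upload whose fingerprint closely matches the reference edition.

    The threshold is an assumed value; a real system would tune it.
    """
    score = similarity(shingles(reference_text, k), shingles(upload_text, k))
    return score >= threshold
```

In a sketch like this, the authorized Smashwords edition would supply `reference_text`, and each new or existing upload would be checked against it. A real system would also need to handle partial copies, samples, and public domain texts that legitimately share content.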

Simply by distributing to Scribd via Smashwords, our authors receive a measure of protective benefit from the BookID technology.

I want to share some hard numbers with you to illustrate the progress Scribd has made to respect and protect the copyright of Smashwords books.

As of January 9, BookID had detected and removed 3,745 book files from Scribd representing 1,725 unique Smashwords books.

In March, Scribd released a new and improved version of BookID, as they promised they would during their call with the Smashwords authors.  The new BookID system has dramatically increased Scribd’s ability to detect unauthorized versions. 

As of today, Scribd reports to me that BookID has removed 47,858 unauthorized copies of 14,090 unique Smashwords books.

Although no automated scanning system will ever be 100% accurate, I’m pleased by the progress and effort made by our friends at Scribd.  I’m pleased that every visitor to the Scribd home page is prompted to purchase a paid subscription to Scribd’s service, because this converts free readers to paid readers for the benefit of Smashwords authors.

Thanks to the support of Smashwords authors who now supply 225,000 titles to Scribd, it's getting tougher for users to upload unauthorized content. With every new Smashwords title delivered, the cleanup continues.  The situation will improve further as Scribd enhances their BookID technology in the future.

Learn more about Scribd's BookID technology at and learn more about Scribd’s commitment to protect your copyright at their new and improved Copyright Resource Center.

I’ll report more exciting Scribd news tomorrow regarding the impressive sales growth we're seeing at Scribd, plus some other big news so stay tuned!


adan said...

Mark, I'd like to add that Scribd has done an astounding job working with me to correct a problem of BookID removing books of mine that shouldn't have been automatically removed. There were a few other glitches involved in that corrective process, and all those have gone incredibly well also. And I'd have to say it's due to the level of author customer service I'm receiving from Scribd.

As you say, their heart is in the right place. And their actions prove it.

Very much looking forward to sales news at Scribd, and any and all other exciting news you've got for us. Thanks so much, sincerely,


Kathy Steinemann said...

This is fantastic news. I don't like DRM, because it limits the functionality of a book for valid users. BookID seems to be a good alternative.

TheSFReader said...

What would be great too is statistics on false positives: books that are removed but shouldn't be.

Also I'd guess people would want to know what recourse they can have if one of their books is removed, how/when they are contacted about it...

Mark Coker said...

SFR, I don't have data on false positives, but it's an interesting question. I do know there have been some, especially around public domain. I know of one instance where a single SW publisher was generating an inordinate number of false positives so their content was removed from the automatic fingerprinting. When a direct uploader has content removed, Scribd sends them an email that contains instructions on how to reverse the erroneous removal. Scribd is tweaking and improving their BookID tech all the time. It's an iterative process of improvement.

Inkling said...

Thanks for the good news. I'm glad I did elect to post to Scribd.

A couple of questions that hinge on the fact that I post ebooks to Smashwords as epubs.

1. Will we soon be able to supply a PDF as well, so that PDF can be posted on Smashwords and perhaps sent to Scribd?

2. Any idea when Smashwords will support epub 3.0? I send an epub 3.0 file to Apple and even to Amazon. It'd be great if I could send the same file to Smashwords. My life would be simpler.

One final note about false positives at Scribd. Could a procedure be worked out so that Smashwords authors get notified if one of their books fails BookID? We're not doing a direct upload, so they don't have our contact information.

Mark Coker said...

Inkling, answers:

1. That's under consideration, but given everything else I see on the roadmap for 2014, I'm not sure it'll happen.

2. This year. Shhh, that's a secret. But yes, Epub3 is planned for direct uploads.

3. SW books won't fail BookID, because Scribd is using the SW versions as the authorized reference editions.

Bea Cannon said...

AHA! Okay, that explains why two of my titles have disappeared from Scribd.
I'd already put a request in to them because I was puzzled as to what happened to them.
So, does that mean those titles were already illegally on their site?
If so, it appears that if an illegal copy is already there, and a legit one is uploaded from SW both are deleted when detected.
I probably wouldn't have known for a while that those titles were missing if a friend hadn't told me. I'm inclined to agree with Inkling that there should be some kind of notification process for SW authors.

adan said...

I'm gonna have to chime in and agree with both Bea & Inkling, but in an even more general request: for authors to get notifications if their book file fails to achieve premium status, if their book file gets a ticket from Apple or other outlet, if a book is deleted from anywhere, not just Scribd or Oyster, for any reason.

And I say this as "more general" because I don't know the logistics re notification, but certainly it shouldn't be impossible, esp in re to the situation with Scribd (who I love btw).

Goals like this, would help make Smashwords into the type of author friendly(er) system, I believe, many of us would really appreciate.

I'm also hoping tomorrow's surprise news has some good info re our sales and views reports from Scribd.

These are not complaints Mark (from me), but what is rarely possible with other major indie outlets, open engagement. And for that, I definitely thank you!

Best wishes -


Ariadne Wayne said...

I have seen this in action for myself. After uploading samples some time ago and forgetting they were there, I received an email to say they had been deleted because the Smashwords versions were on there. I was very impressed. :)

ray p daley said...

There needs to be data where your system can't tell when a Scribd user name and Smashwords author are one and the same person.

Especially when said author has loaded their free content onto Scribd before you bought it out then gets threatening "cease & desist" emails in regard to their OWN copyrighted content.

I've already had to tell your IT department off twice for sending me those automatic mails. Try involving some real humans in the process BEFORE emails get sent out. People have common sense and reason, not cheap algorithms.

adan said...

There's a nice follow up Scribd related article, about Smashwords authors being featured at Scribd :

widdershins said...

Very cool news! Smiling I am.

mulgamarg said...

I found one of my books on Scribd some months ago. I sent them an email and the copied book was removed right away. I then subscribed to Scribd and have been delighted with their range. So glad that thanks to Smashwords they now have all legit copies of my books too.

Karen Myers said...

And yet...

No sooner do my Smashwords titles (10) appear on Scribd than all 10 of them are posted on a pirate site. I am working on getting them removed from there right now.

How can we protect ourselves downstream from Scribd when there is no cost for a Scribd user to pirate my titles?

Mark Coker said...

Karen, I'm not familiar with that site, but please forward a direct hyperlink to any book page you think is unauthorized at the site to our support team and they can take a look. I don't know if this applies to download genius, but we do often get reports from authors concerned about pirate sites which turn out to *not* be pirate sites but instead are simply linking to samples or book pages of authorized sites or retailers (their business model is affiliate marketing). A common one causing concern is We've seen others that are truly scams that don't host files, but are entirely focused on stealing the credit card numbers. Regardless, most sites (including this download genius one) have DMCA takedown links. Use those but please do send our support team direct links so we can take a look as well.