[INDOLOGY] Query about the misuse of CC-licensed articles

Harry Spier vasishtha.spier at gmail.com
Tue Dec 12 00:40:02 UTC 2023

Yesterday I did a Google search on one of the header lines in the
GRETIL etexts and I didn't find any of their etexts on bogus sites.  The
etexts of Muktabodha I found on bogus sites were all pdfs.  So it might
be that these automated web scrapers just grab any pdfs and only pdfs to
create these scam websites.
Harry Spier

On Mon, Dec 11, 2023 at 7:28 PM Dominik Wujastyk <wujastyk at gmail.com> wrote:

> Thanks, Harry, this is very interesting.
> Yes, Scribd has a takedown procedure.  It's a bit laborious, but it does
> work well.  Once you've filled in the form, they take down the stuff
> quickly and without cavil.   Some of it pops up again after a year or two.
> Best,
> Dominik
> On Wed, 6 Dec 2023 at 16:16, Harry Spier <vasishtha.spier at gmail.com>
> wrote:
>> Dominik asked in this context:
>>> Have you ever had one of your articles copied and hosted on a website
>>> you didn't like?  Was there anything you could do about it?
>> This was a problem for the Muktabodha digital library in the past but
>> appears to be getting better.
>> In 2010 from my notes:
>> A search on Google showed:
>> 7 Muktabodha e-texts added to bogus digital libraries in the last week
>> alone,
>> 46 copies total from 26 individual e-texts in the last month,
>> 178 copies total from 71 individual e-texts in the last year.
>> I'm not sure if the Google result for all time is correct.  At one point
>> it was saying there are 8,900 Muktabodha e-texts on the internet, but it
>> also gave me a result of 4,500 copies of 61 individual e-texts. (I'm not
>> able to explain these discrepancies in the number of individual e-texts
>> other than that it's an estimate.  In any case there are a lot of Muktabodha
>> e-text copies out there).
>> But the bottom line is that about 1/3 of our e-texts in multiple copies
>> are on many bogus digital library sites and the number is increasing by the
>> week.
>> By "bogus digital libraries" I mean sites created by automated webcrawlers
>> that collect pdf and .doc files and create collections as clickbait.
>> When I tried to access many of these sites, I was blocked by my malware
>> software because of trojans and malware on those sites.
>> Today it seems to be getting better.  I don't know if it's getting better
>> because the Muktabodha digital library is now password protected or if
>> Google is taking these automated pdf collection sites down quicker. A quick
>> search showed only about 10 sites with Muktabodha etexts but my malware
>> software stopped me entering three of those sites because of what it called
>> "fraud". One of Muktabodha's etexts was added by someone into wikisource
>> but with the licence changed by them to Creative Commons.
>> If I remember correctly there is a mechanism to take down articles from
>> Scribd.
>> Harry Spier