Discussion:
[Wikimediaindia-l] Zero output for OCR using Google Drive API
Bodhisattwa Mandal
2018-03-15 12:00:36 UTC
Hi,

For the last few weeks, Google OCR via the Drive API has stopped working for
a few Indic scripts such as Bengali and Devanagari, affecting the Bengali,
Sanskrit and Assamese Wikisources. It is still working for other Indic
scripts, I guess.

If anyone has a contact at Google, we need to know whether this is temporary
or whether a major change is underway.

This is extremely important, as sooner or later it will drastically affect
every Indic Wikisource project, and we have to plan accordingly.

Regards,
Bodhisattwa
Tito Dutta
2018-03-16 09:47:55 UTC
Hi,
True, this is really important. I tried the manual OCR process, and the
result is the same as you described above (blank pages).
I am not sure whether I can find someone at Google who can give us an update
on it, but let's see.


Thanks
Tito Dutta
Note: If I don't reply to your email in 2 days, please feel free to remind
me over email or phone call.
_______________________________________________
Wikimediaindia-l mailing list
To unsubscribe from the list / change mailing preferences visit
https://lists.wikimedia.org/mailman/listinfo/wikimediaindia-l
Shrinivasan T
2018-03-18 17:00:49 UTC
Jayantanth helped explore this and found that PDF-to-text conversion is not
working for Bengali, but image-to-text conversion is working fine.

I am currently working on adding PDF-to-image conversion to OCR4wikisource.

I am using the ImageMagick tool for this conversion, but it eats all the CPU
and crashes my 8GB laptop.

I am exploring other options for a slower but safer conversion.

Will update soon on this.


Shrini
