-
April 25th, 2019, 21:15 #21
- Join Date
- Jun 2018
- Posts
- 23
Last edited by mostcallmetim; April 25th, 2019 at 23:20.
-
April 26th, 2019, 21:08 #22
If you want some information on why text recognition (from OneNote or other apps) often has problems, look up "ligature": in many fonts, some character pairs are actually placed in the file as a single 'special' character. Plus, even though the recognition software tries to determine the font style used, there are a nearly unlimited number of fonts and variations that look exceedingly similar. Then there is the issue of image quality.
Unfortunately, unless the PDF is always saved with all text and fonts included, you are often going to get errors. And since many people use PDF to protect their files so they cannot easily be "copied" to other sources, they don't want to use those features.
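To illustrate the ligature point: when copied text comes through with single-character ligatures (for example U+FB01, the "fi" ligature), Unicode compatibility normalization can expand them back into plain letters. This is only a minimal sketch using Python's standard library. The function name and sample string are made up for illustration, and NFKC also rewrites other compatibility characters (superscripts, full-width forms), so check the result before relying on it.

```python
import unicodedata


def expand_ligatures(text: str) -> str:
    """Expand compatibility ligatures (e.g. U+FB01) into their plain letters.

    NFKC normalization replaces each compatibility character with its
    equivalent sequence, so the single 'fi' glyph becomes 'f' + 'i'.
    """
    return unicodedata.normalize("NFKC", text)


# Hypothetical text as it might come out of a PDF copy/paste:
# U+FB01 = fi ligature, U+FB02 = fl ligature, U+FB03 = ffi ligature.
sample = "The \ufb01rst \ufb02oor o\ufb03ce"
cleaned = expand_ligatures(sample)
print(cleaned)
```

This only helps when the ligature arrives as a real Unicode character. If the PDF maps the glyph to a private or wrong code point, or the page is just an image, it is back to OCR and proofreading.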
Problems? See; How to Report Issues, Bugs & Problems
On Licensing & Distributing Community Content
Community Contributions: Gemstones, 5E Quick Ref Decal, Adventure Module Creation, Dungeon Trinkets, Balance Disturbed, Dungeon Room Descriptions
Note, I am not a SmiteWorks employee or representative, I'm just a user like you.
-
April 26th, 2019, 21:13 #23
Most of the time when I fully extract all data from PDFs, the full font isn't even included—they're 'subfonts' with only the characters present in the text. Must be an InDesign function?
If the poster above is working with older material now in PDF form (such as AD&D 1E and 2E), those PDFs come from scans: WotC requested that people send in their best-quality scans a few years back because the original manuscripts were lost. They will naturally be images rather than proper text, and you'll be limited to the OCR capabilities of whatever software you use to work out what the text is. Text styles, custom characters, ligatures, and many other factors (not the least of which is the image quality itself) all contribute to OCR fallibility.
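On the 'subfonts' point: by PDF convention, a subsetted font's name carries a tag of six uppercase letters followed by a plus sign in front of the real font name. A rough sketch for spotting those names once you have a list of font names out of a PDF (the extraction step itself is not shown, and the names below are invented for the example):

```python
import re

# Subset fonts are conventionally named "XXXXXX+RealName",
# where XXXXXX is a tag of exactly six uppercase letters.
SUBSET_TAG = re.compile(r"^[A-Z]{6}\+")


def is_subset_font(base_font_name: str) -> bool:
    """Return True if the font name carries a subset tag prefix."""
    # Names pulled from a PDF often keep a leading slash, so drop it first.
    return bool(SUBSET_TAG.match(base_font_name.lstrip("/")))


# Hypothetical font names, as a PDF inspection tool might list them.
for name in ["/ABCDEF+Garamond-Bold", "/Helvetica", "QWERTY+Minion-Regular"]:
    kind = "subset" if is_subset_font(name) else "no subset tag"
    print(f"{name}: {kind}")
```

A name without the tag is not proof that the full font is embedded. The font may simply not be embedded at all.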
-
April 26th, 2019, 22:40 #24
Don't know. But I do know most PDF print drivers have an option to embed all fonts. So at least in some cases it depends upon what options are selected when the PDF is created.
And, as you point out, sometimes even the publishers don't have much option as to how the PDF is created and we (the end users) have no say in what features the PDF includes.
-
January 4th, 2020, 17:44 #25
- Join Date
- Jan 2017
- Location
- Cleveland Ohio
- Posts
- 7
I would suggest Okular (it's a KDE product that can be run on Windows), which has been working really well for me when I work on conversions.
-
January 15th, 2020, 19:12 #26
- Join Date
- May 2018
- Posts
- 199
Here are a couple of things that may interest you... cheers!
Pulling text from an image using Google Docs: I have done this and it was easy. Once the text was pulled, it was as simple as copy/paste and a little proofreading.
https://www.youtube.com/watch?v=eC6VmwWEcXw
Also, I haven't used this yet but will be soon: you should include Project: Author by Celestian on your list.
https://www.fantasygrounds.com/forum...project+author
Last edited by Beemanpat; January 15th, 2020 at 19:15. Reason: correct spelling