Now that 2.0 is in public testing, I've switched back to looking at the parser. There have been a number of posts about random pieces of compendium data that do not scrape correctly. I decided to investigate these by attempting to scrape every published book and seeing what breaks and why.
Since the compendium feature is a new component of the parser, I never went back and ran serious testing against all the modules I had completed. In fact, I've only been using it on material published after Divine Power. Anything that predates that, I parsed using either a purchased PDF or OCR from my purchased printed copy. This means several of the "big" books (PHB, MM, etc.) never got the full testing.
That's not to say I ran no testing on them, but I typically ran a sampling, because extracting and processing something like the entire Adventurer's Vault is a pain. It's even more of a pain when your extraction process is still a bit suspect and you know you're going to have to do it more than once.
Anyhow, I decided to bite the bullet and go through each and every book and note what's broken, what I can fix, and what's beyond my control. I've been going in alphabetical order (for lack of a better system) and I am currently on the Player's Handbook and Player's Handbook 2. Both had issues with certain rituals, which I think are fixed. I still need to re-scrape and re-parse them both to make sure the issue is taken care of and that I didn't break anything else along the way.
I'm hopeful that I'll be able to complete the process this week.