I've been doing test runs using SSPX and the AV, and I've finally got it producing something usable. A lot of the AV information in the compendium isn't laid out as neatly as it is in the AV book. The mount table, for example, is missing. Also, the compendium's "Item Query" returns multiple types of data, so getting this working was as hard as getting any three other types of compendium data working.
Finally, and this is the painful part, the compendium lists each item/level separately. Really, that's the most useful way to present it. However, that means each stat block in the printed AV is (on average) 4 entries in the compendium. In fact, the compendium AV returns close to 2900 results.
Now, as those of you who have played with scraping things like the PHB, Divine Power, etc. might have noticed, larger datasets can take some time to scrape. In the case of the AV, it takes somewhere between an hour and an hour and a half. A lot of this time is actually spent delaying.
SSPX has a deliberate delay between each query, mostly so it plays nice with the Wizards web server and doesn't attempt to pound it to pieces. For small sets this isn't bad. For something of the AV's scale, it becomes annoying. Still, it's faster than doing it by hand.
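For anyone curious what that kind of delay looks like, here is a minimal sketch in Python. It is not SSPX's actual code: the function name `fetch_all`, the `fetch` callback, and the one-second default are all assumptions made up for illustration.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def fetch_all(
    ids: Iterable[T],
    fetch: Callable[[T], R],
    delay_seconds: float = 1.0,  # assumed value; the real delay may differ
    sleep: Callable[[float], None] = time.sleep,
) -> List[R]:
    """Run fetch() once per id, pausing between consecutive queries."""
    results: List[R] = []
    for index, item_id in enumerate(ids):
        if index > 0:
            # Pause between queries so the server is never hit back to back.
            sleep(delay_seconds)
        results.append(fetch(item_id))
    return results
```

If you assume one query per result and a one-second pause, roughly 2900 results means about 48 minutes of pure waiting, which would fit with most of the scrape time being delay.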
Just remember, when you're frustrated, that I had to run it well over a dozen times, finding and fixing bugs, before I got a successful parse out of it.
In parser news, I added a 1.5.1 parse option which moves the flavor text back into the formatted text block so it can be displayed on legacy rulesets. I also altered the handling of the compendium Enhancement: tag. It now handles "+1 AC" by placing "+1" in the bonus field and "AC" in the enhancement field, which is what the PDF-based data did.
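The Enhancement: split can be sketched roughly like this in Python. This is an illustration only, not the parser's real code: `split_enhancement` is a hypothetical name, and the fallback for values without a leading bonus is my own assumption.

```python
import re
from typing import Tuple

# A signed number, some whitespace, then whatever the bonus applies to.
_ENHANCEMENT = re.compile(r"^\s*([+-]\d+)\s+(.*\S)\s*$")


def split_enhancement(value: str) -> Tuple[str, str]:
    """Split an enhancement value into (bonus, enhancement) fields."""
    match = _ENHANCEMENT.match(value)
    if match is None:
        # Assumed fallback: no leading bonus, so keep the whole text
        # in the enhancement field and leave the bonus empty.
        return "", value.strip()
    return match.group(1), match.group(2)


bonus, enhancement = split_enhancement("+1 AC")
print(bonus, enhancement)
```

Longer values such as "+2 attack rolls and damage rolls" split the same way, with everything after the bonus landing in the enhancement field.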