After much pondering and mulling since the last post, I’ve realized why the .xsd for MARCXML is so lousy. It’s intended to be very broad, so that it can validate an incredibly wide assortment of MARC. To that end, its authors essentially allowed any three-digit combination to be entered as the tag attribute of the datafield. You can enter invalid numbers, like 114 or 470, and they’ll validate. Similarly, you can enter any digit or any lower-case letter as an indicator for any field, which again breaks the spec! It even allows bizarre characters like ? or & as subfield codes. To my knowledge there is no field which allows subfield ?, but the schema accepts it anyway.
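To make that looseness concrete, here is a rough Python sketch. The three patterns are my paraphrase of the behaviour described above, not text copied from the actual .xsd, and `loosely_valid` is a made-up helper name. I have also assumed a blank counts as a legal indicator.

```python
import re

# ASSUMPTION: simplified stand-ins for the schema's patterns, written
# from the description above rather than copied from the real .xsd.
LOOSE_TAG = re.compile(r"[0-9]{3}")              # any three digits
LOOSE_INDICATOR = re.compile(r"[0-9a-z ]")       # digit, lower-case letter, or blank
LOOSE_SUBFIELD_CODE = re.compile(r"[0-9A-Za-z?&]")  # includes the odd ? and &


def loosely_valid(tag, ind1, ind2, code):
    """Return True if every value passes its loose pattern (hypothetical helper)."""
    checks = (
        (LOOSE_TAG, tag),
        (LOOSE_INDICATOR, ind1),
        (LOOSE_INDICATOR, ind2),
        (LOOSE_SUBFIELD_CODE, code),
    )
    return all(pattern.fullmatch(value) is not None for pattern, value in checks)


# Tag 114 with indicators z and q and subfield ? passes every loose check.
print(loosely_valid("114", "z", "q", "?"))
```

Nothing here knows what any tag means, which is exactly the problem: a pattern that only checks the shape of a value can’t tell a real field from a made-up one.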
At first, I planned to use Schematron, which @LibSkrat had suggested to me. It solves one of the problems I’d faced earlier in my XML work: XSD 1.0 isn’t able to constrain or restrict the value of an element or attribute based on its parent’s value. But it turns out XSD 1.1 can! Hooray!
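For anyone wondering what a rule “based on the parent’s value” looks like, here is the idea sketched in plain Python rather than in XSD 1.1 or Schematron. Everything in it is illustrative: `RULES` and `check_datafield` are hypothetical names, the 245 entry is a partial list written from memory that should be checked against the MARC 21 documentation, and the sample XML leaves out the MARCXML namespace for brevity.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: a tiny, deliberately partial rule table for illustration only.
RULES = {
    "245": {
        "ind1": set("01"),
        "ind2": set("0123456789"),
        "codes": set("abcfghknps68"),
    },
}


def check_datafield(field):
    """Return a list of problems for one datafield element (hypothetical helper)."""
    tag = field.get("tag")
    rule = RULES.get(tag)
    if rule is None:
        return [f"tag {tag!r} is not in the rule table"]
    problems = []
    # Which indicator values are legal depends on the tag.
    for name in ("ind1", "ind2"):
        value = field.get(name, " ")
        if value not in rule[name]:
            problems.append(f"{tag}: {name} value {value!r} is not allowed")
    # Which subfield codes are legal depends on the parent's tag, too.
    for subfield in field.findall("subfield"):
        code = subfield.get("code")
        if code not in rule["codes"]:
            problems.append(f"{tag}: subfield code {code!r} is not allowed")
    return problems


SAMPLE = (
    '<datafield tag="245" ind1="1" ind2="0">'
    '<subfield code="a">A title</subfield>'
    '<subfield code="?">Not a real subfield</subfield>'
    "</datafield>"
)

for problem in check_datafield(ET.fromstring(SAMPLE)):
    print(problem)
```

The point is only that the allowed indicators and subfield codes are looked up by tag, so the check on a child depends on an attribute of its parent.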
So tonight I embark on a vainglorious mission: writing (and annotating, ’cause you GOTTA annotate) my very own MARCXML schema, one that validates the actual values of tags, indicators, and subfields. I’ll try to be as generous as I can, I’ll make mistakes, and I’ll share as I go.
Wish me luck!