Unhandled exception on PLR conversion


This is more odd than I originally thought.

It's an EndOfStream exception, thrown by BinaryReader.FillBuffer(2) at line 408.

Could not read 2 more bytes of 2

Stack trace:
at LinqToStdf.BinaryReader.FillBuffer(Int32 length)
at LinqToStdf.BinaryReader.ReadToBuffer(Int32 length, Boolean endianize)
at LinqToStdf.BinaryReader.ReadToBuffer(Int32 length)
at LinqToStdf.BinaryReader.ReadUInt16()
at LinqToStdf.Records.V4.Plr.ConvertToPlr(UnknownRecord unknownRecord)
at LinqToStdf.RecordConverterFactory.Convert(UnknownRecord unknownRecord)
at LinqToStdf.StdfFile.<InternalGetAllRecords>d__23.MoveNext()
at LinqToStdf.StdfFile.<SetStdfFile>d__1.MoveNext()
The STDF file that reveals this bug is approximately 100MB, and the exception occurs when converting the PLR record starting at offset 0x3492. The record header is 5C 01 01 3F: a little-endian record length of 0x015C (348 bytes after the header), record type 1, subtype 63 (PLR). A WCR record follows the PLR, which is in turn followed by all kinds of goodness. The UnknownRecord passed to the converter has 348 bytes of content, so I don't think this is as simple as a truncated file.

More to come...
Closed Apr 12, 2012 at 7:50 PM by Selzhanik
I'm content this bug has been resolved, with end of stream tolerant string array conversion, coupled with a clearer exception message when this sort of thing happens (though this particular case wouldn't).


Selzhanik wrote Jun 16, 2009 at 9:48 PM

I think I'm figuring this out...

The PLR in question has a group count of 172. That leaves 346 bytes. The group index array is 344 bytes (172 x 2), so that leaves 2 bytes. Since the reader is not at the end of the stream after populating the indexes, it attempts to read another 344 bytes of data to populate the 172 group operating modes. Only 2 bytes are left at this point (which, by the way, translate to a group mode of 31116, and I'm not sure what that means, other than that it's reserved by Teradyne and the file is from a J750).
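The arithmetic above can be checked with a small sketch. This is a Python model of the byte budget, not LinqToStdf code; the field sizes (GRP_CNT as a two-byte U2, GRP_INDX and GRP_MODE as kxU2 arrays) come from the STDF V4 spec, and the variable names are mine.

```python
import struct

# Byte budget of the failing PLR, modeled from the STDF V4 field sizes.
record_length = 348          # from the 5C 01 header (0x015C, little-endian)
group_count = 172            # GRP_CNT, read from the first two bytes

remaining = record_length - 2            # after consuming GRP_CNT
index_array_bytes = group_count * 2      # kxU2 GRP_INDX
remaining -= index_array_bytes           # 346 - 344 = 2 bytes left
mode_array_bytes = group_count * 2       # kxU2 GRP_MODE wants another 344

shortfall = mode_array_bytes - remaining
print(remaining)    # 2
print(shortfall)    # 342 -- hence "Could not read 2 more bytes of 2"

# The two leftover bytes, 8C 79, read as a little-endian U2:
(mode,) = struct.unpack('<H', bytes([0x8C, 0x79]))
print(mode)         # 31116
```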

So now we're left with an interesting choice. Obviously we shouldn't simply throw an exception that never gets handled. But I can see a few potentially valid actions to take:
  • Convert to a CorruptDataRecord
    • In the converter factory, catch exceptions when converting UnknownRecords, convert to a CorruptDataRecord instead
    • Change the PLR converter so that it can return either a CorruptDataRecord or a Plr
  • Convert what can be converted as a valid PLR
    • Assume that if we run out of bytes while decoding either the mode or radix arrays, the remainder of the array values should be interpreted as 0 (the missing data indicator), and add that "partial" array to the PLR that is returned
    • Assume that if we run out of bytes while decoding any of the arrays, the array alone is invalid, and should be left missing from the returned PLR
Some supporting info about this particular PLR...
  • The PLR pin indexes (172 of them) include 349-352 (implying pins), 32768-32934 (implying pin groups), and 32 (again implying a pin), in that order
  • The final 2 bytes of that PLR are 8C 79, which could be interpreted as an operating mode of 31116 for pin index 349 (which is a valid pin index)
  • The PLR is preceded by 448 PGRs, all of which are exactly the same: 11 (4 + 7) bytes, group index of 0, no group name, 1 pin in the group, which has a pin index of 0. These are completely useless and technically invalid PGRs (their index of 0 is below 32k, the minimum for a group index), but they do imply that over 167 groups actually exist, and could have been numbered starting at 32768, if only the PGRs had been written correctly.
  • The PGRs are preceded by 357 PMRs, whose indexes include 32, 349-352, and many others. These actually seem to be valid and useful, with the typical per level information.
I really don't like any of the options I've thought of. For this particular PLR, I think it's fairly obvious that having an array of indexes by itself, without any of the subsequent array data, is simply stupid. But does pin index 349 actually have an operating mode of 31116? If so, we wouldn't want to simply throw that away. On the other hand, do we trust that 31116 is actually a valid operating mode, given that the other pins' indexes are listed yet the array is cut short?

And then the bigger question I think is whether we should catch converter exceptions and re-convert those unknown records to CorruptDataRecords. I think we should, and I was honestly under the impression we already were... Not sure why, but I was. Any reason why not?

I can really see us making two changes here... one to catch exceptions and convert to CorruptDataRecord, and another to nicely finish off PLR arrays if the record runs out prematurely, with the intent of salvaging what information is available.

Selzhanik wrote Jun 16, 2009 at 10:36 PM

After talking with Mark, decided to try this:
  • Catch exceptions encountered while converting UnknownRecord in StdfFile.InternalGetAllRecords
  • Convert the unknown record into a CorruptDataRecord
  • CorruptDataRecord will result in a thrown StdfFormatException (if that option is on), whose...
    • Message indicates the record failed conversion
    • InnerException captures the original exception
    • ErrorRecord holds the CorruptDataRecord
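The control flow of that plan can be sketched as follows. This is a hypothetical Python model of the catch-and-convert idea, not the actual LinqToStdf API; `CorruptDataRecord`, `convert_or_flag`, and the dict-shaped record are all illustrative names.

```python
class CorruptDataRecord:
    """Illustrative stand-in: wraps the raw bytes of a record that
    failed conversion, plus the exception that caused the failure."""
    def __init__(self, offset, content, error):
        self.offset = offset
        self.content = content
        self.error = error

def convert_or_flag(unknown_record, converter):
    """Model of the plan above: try the real converter; on failure,
    push a CorruptDataRecord through the stream instead of letting
    the exception escape unhandled."""
    try:
        return converter(unknown_record)
    except Exception as e:
        return CorruptDataRecord(unknown_record["offset"],
                                 unknown_record["content"], e)

def broken_plr_converter(rec):
    # Stand-in for the failing PLR conversion described in this issue.
    raise EOFError("Could not read 2 more bytes of 2")

rec = {"offset": 0x3492, "content": b"\x00" * 348}
result = convert_or_flag(rec, broken_plr_converter)
print(type(result).__name__)   # CorruptDataRecord
```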

Selzhanik wrote Jun 16, 2009 at 10:56 PM

Not going to be exactly like this... will experiment a bit, get a change set checked in soon

marklio wrote Jun 17, 2009 at 2:08 AM

Fun stuff. I think handling some set of exceptions during conversion and pushing CorruptDataRecords through the stream is goodness overall. In fact, after thinking about this, I'm pretty sure that I had wanted to build that into the converter codegen (and perhaps I did, I'll have to check) rather than wrapping the conversion in a try/catch. If that's the case, we can go with the model that converters are in charge of their own "recovery", which may be a good model since the converter has a much better idea of the intent of any given operation and can choose the set of recoverable problems within a properly bounded record, as well as provide useful error messaging.

If that's the case, what to do specifically inside the PLR converter hinges somewhat on whether the pattern you've found occurs often or not. One option would be to have a "pure" PLR converter, and a record filter that catches these "dirty" PLRs (by parsing the corrupt records) and gets the useful data out of them. Another would be to register a "wrapper" converter, which would take the original conversion, catch the exception, and create a DirtyPLR record that would give you the useful information, while alerting you to the fact that it's yucky.

All these options follow the principle of keeping the defaults true to the spec, while allowing special cases through the extensibility model.

Just some thoughts.

Selzhanik wrote Jun 17, 2009 at 11:04 PM

I'll get back to you tonight about what I'm up to with regards to the corrupted data record changes...

But until then, here's something to chew on. The STDF spec for the PLR has a note that says:
"empty arrays (or empty members of arrays) can be omitted if they occur at the end of the record"

So it would seem that it's technically valid to end the record mid-array, and the remainder of the array should be default values. Ugly in my opinion, but that's the spec.

Also, please take a look at the spec and tell me what you think... The 4 string array fields are designated as kxCn, which according to the spec is a series of Cn's. A Cn is defined as a variable-length character string, beginning with the length as the first byte, followed by the string itself. Yet the explanation in the PLR notes seems to imply that each of these strings in each of the 4 arrays may only have a length of 1.

Now, if I were writing the spec, I would have made these arrays either C*f or kxC*1 (which would basically mean the same thing) and eliminated all those extra 0x01's. But you seem to have coded it as if it actually were C*f or kxC*1. I haven't yet found an STDF with strings in the PLR (I haven't looked all that hard). Have you seen an STDF where those arrays exist? And if so, is it encoded in the way you have the code written? I'd hate to change the code based on a perceived disagreement with the spec, when a real-life STDF shows the characters encoded smartly. But maybe this is just a bug based on a misreading of the spec, and we just haven't had an STDF to demonstrate the bug yet?
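The difference between the two encodings is easiest to see in a decoder. This is an illustrative Python sketch of Cn (length-prefixed string) decoding per the spec discussion above; the function names are mine, not anything from LinqToStdf.

```python
def read_cn(data, pos):
    """Cn: one length byte, then that many characters."""
    n = data[pos]
    return data[pos + 1 : pos + 1 + n].decode('ascii'), pos + 1 + n

def read_kx_cn(data, pos, k):
    """kxCn: k length-prefixed strings back to back."""
    out = []
    for _ in range(k):
        s, pos = read_cn(data, pos)
        out.append(s)
    return out, pos

# Two Cn strings: "HL" then "M" -- note the per-string length bytes.
buf = bytes([2]) + b"HL" + bytes([1]) + b"M"
strings, end = read_kx_cn(buf, 0, 2)
print(strings)   # ['HL', 'M']

# A kxC*1 reading would instead take each byte as one character with no
# length prefixes -- which is the suspected copy/paste bug above.
```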

marklio wrote Jun 18, 2009 at 3:19 AM

I'd have to take a look at the implementation again, but it's quite possible that the PLR converter is full of bugs that haven't been exposed for whatever reason. It definitely looks like you're right about the strings. My implementation seems to be doing kxC*1. Probably copy/paste error.

marklio wrote Apr 5, 2012 at 3:37 AM

I re-read that part of the spec. It definitely seems possible, and according to the spec, to end an array in the middle of the array. The rest of the things this discussion covers are interesting, but I'm going to see how tricky it will be to implement that functionality for array reading (that is, read until you hit end of stream and just stop, provided the record still makes sense).

marklio wrote Apr 5, 2012 at 3:45 AM

On deeper thought, it is unclear precisely what this means. Since PLR is weird enough to have its own converter, I'll let us handle this in PLR if we need it, and assume that since the note is in the PLR section that it only applies there.

marklio wrote Apr 5, 2012 at 3:46 AM

I'll let you decide what to do with this. I'm fine with not fixing this.

Selzhanik wrote Apr 9, 2012 at 8:58 PM

Coming back to this after a long time, since I'm into the PLRs for writing an unconverter. I found a good example of a PLR from a J750 STDF. This particular PLR has ProgramStatesRight strings that are all "01LHMVXW". I assume these correspond to off, on, low, high, mid, valid, etc. ReturnStatesRight strings are "LHMG". So the idea behind these is that they are indeed string arrays, but their content is a series of single characters, which represent the single-character possible state values in FTRs and MPRs. Or, in the case where the "Left" arrays are populated, two-character state values.

The spec does clearly state that the arrays do not need to be fully populated with "missing/invalid" values if there are no more valid values to write, nor are there any more valid fields to write. Thus, the current converter has a bug, which I'll be fixing shortly along with the implementation of the unconverter.

Selzhanik wrote Apr 9, 2012 at 9:14 PM

While converting an unknown record to PLR should tolerate hitting end-of-record before an array is complete, there is the question of array sizing. Say we expect 74 ProgramStatesRight strings, but we run out of record after the 54th string. Do we fill the remaining 20 values with String.Empty? Or do we resize the string[74] to a string[54]? My vote is for resizing.
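The "resize" option can be sketched as an end-of-record-tolerant reader: read strings until the record runs out, then return only what was actually read. Again a hypothetical Python model, not the actual converter code.

```python
def read_string_array_tolerant(data, pos, expected_count):
    """Read up to expected_count Cn strings; stop cleanly at end of
    record and return the shorter array (the 'resize' option, rather
    than padding the remainder with empty strings)."""
    out = []
    for _ in range(expected_count):
        if pos >= len(data):
            break                      # end of record: stop, don't pad
        n = data[pos]
        if pos + 1 + n > len(data):
            break                      # partial string: treat as end
        out.append(data[pos + 1 : pos + 1 + n].decode('ascii'))
        pos += 1 + n
    return out, pos

buf = bytes([1]) + b"L" + bytes([1]) + b"H"   # record ends after 2 of 4
arr, _ = read_string_array_tolerant(buf, 0, 4)
print(len(arr))   # 2, not 4 -- the array is resized, not padded
```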

That brings us to unconverting a PLR to an unknown record -- we have the option of encoding it such that it truncates the array writing as the spec says it can do, or padding the end of the array with missing values. My vote here is to truncate it, but only if the array itself is truncated. For instance, if PlrGroupCount is 74 and Plr.ReturnStatesRight is a string[54] (and the remaining two arrays are null), then the Plr ends after the ReturnStatesRight array's 54th entry. That way, the behavior is controllable. If one wants a Plr's ReturnStatesRight array padded with missing values, then one may assign String.Empty or null to the 55th through 74th entries.

My plan here is to write the PLR arrays out in the way that they are assigned.
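The write-side policy described above (truncate only when the array itself is short; pad only when the caller explicitly assigned trailing empty entries) can be modeled like this. Illustrative Python, with a hypothetical function name.

```python
def write_string_array(strings):
    """Write a kxCn array exactly as assigned: each string gets a
    length byte followed by its characters, and a short array simply
    ends the record early, per the spec note quoted earlier."""
    out = bytearray()
    for s in strings:
        b = s.encode('ascii')
        out.append(len(b))
        out.extend(b)
    return bytes(out)

# Truncated: only 2 entries written, even for a group count of 4.
print(write_string_array(["L", "H"]).hex())          # '014c0148'
# Padded: the caller opts in by assigning empty strings to entries 3-4.
print(write_string_array(["L", "H", "", ""]).hex())  # '014c01480000'
```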

Selzhanik wrote Apr 11, 2012 at 10:42 PM

The latest push should take care of this, but I need to test it.

Selzhanik wrote Apr 12, 2012 at 6:21 PM

Wow. No, my latest push of the PLR converter/unconverter is worthless. Will have a new version pushed later today.

Selzhanik wrote Apr 12, 2012 at 7:48 PM

I'd like to clean up the PLR unconversion code extensively now that we're leaning back toward not CodeGen'ing it, but the latest push takes care of the bugs and it now can "round-trip" some reasonably sized PLR records without any differences.