Parsing TAR files

Post author
Denys Grozenok

When I parse a TAR file consisting of multiple concatenated EDI files using SharpZipLib, I have to iterate through the files in the same TAR stream and create an X12Reader on that stream at different positions. However, when the reader is disposed, it also disposes the base stream. Is it possible to add a boolean property to the X12ReaderSettings class to specify whether the base stream should be left open after the reader is disposed? Then, when creating the StreamReader internally, you could pass this property as the last parameter of the StreamReader constructor, which indicates whether the stream should be left open after the reader is closed:

public StreamReader (System.IO.Stream stream, System.Text.Encoding encoding, bool detectEncodingFromByteOrderMarks, int bufferSize, bool leaveOpen);
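
For illustration, here is a minimal sketch of what the leaveOpen parameter does, using only System.IO. It does not use X12Reader or SharpZipLib: a MemoryStream stands in for the TAR stream, and ReadLineAt and the hard-coded offsets are hypothetical names and values chosen for the example.

```csharp
using System;
using System.IO;
using System.Text;

// Two text records in one stream, standing in for two entries of a TAR stream.
var stream = new MemoryStream(Encoding.ASCII.GetBytes("first\nsecond\n"));

string firstLine = ReadLineAt(stream, 0);

// The reader created inside ReadLineAt has been disposed, but the stream is still usable.
bool openAfterDispose = stream.CanRead;

// "first\n" is 6 bytes long, so the second record starts at offset 6.
string secondLine = ReadLineAt(stream, 6);

Console.WriteLine($"{firstLine}, {secondLine}, open: {openAfterDispose}");
stream.Dispose();

// Hypothetical helper: reads one line starting at the given byte offset.
static string ReadLineAt(Stream source, long offset)
{
    // StreamReader reads ahead into its own buffer, so the stream position
    // is set explicitly instead of relying on where the last reader stopped.
    source.Position = offset;

    // leaveOpen: true means disposing the reader does not close the stream.
    using var reader = new StreamReader(source, Encoding.ASCII, false, 1024, leaveOpen: true);
    return reader.ReadLine() ?? string.Empty;
}
```

Note that because StreamReader buffers, the position of the base stream after a read is usually further along than the data actually consumed, which is why the sketch repositions the stream before each read.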

Comments

  • Comment author
    Admin

    Yes, absolutely, we'll add this option in the next release.

  • Comment author
    Admin

    BTW, reader.Item has a BytesRead property, which counts the number of bytes read from the beginning, so you might find it useful.

  • Comment author
    Denys Grozenok

    Thank you for considering the additional configuration setting! For now, I'm just disposing the underlying stream and not disposing the X12Reader at all. Do you think that is OK as a workaround, or could some resources be leaked?

    Thanks for the hint about the BytesRead property. Could you please elaborate on how it could be useful in my scenario? Would I compare it with the stream size to decide whether the last file in the stream has been read, and dispose the X12Reader only at that point?

    Thanks,

    Denys

  • Comment author
    Admin

    There shouldn't be any leaks, and it makes sense as a workaround. I only mentioned BytesRead in case it's applicable to your scenario.

    X12Reader should be able to go over the TAR stream (or any stream) and locate and parse EDI data as it is found, without you having to create X12Readers at different positions (I think so, but I don't know the contents of the TAR stream). Set the ContinueOnError flag to tell the reader to carry on. I'd be interested to hear whether that works.
  • Comment author
    Denys Grozenok

    Yes, I can use the TAR file as is with the ContinueOnError flag. It does have some header information for each file entry, such as the file name, and the reader seems to skip those fine with the ContinueOnError flag set. My only concern with that approach is that I lose the name of the file being processed, and I have to keep track of exactly which files were processed. That said, I'm now thinking it would probably be more accurate to use ISA.InterchangeControlNumber_13 instead of the file names for tracking purposes. What would you recommend?

    0
  • Comment author
    Admin

    The reader returns the first 100 characters of any data it can't parse in the ReaderErrorContext. If the file name can be found among those first 100 characters, you can use the ContinueOnError flag; otherwise you'll have to do it your way.

    The ISA control number is a good thing to keep track of. However, bear in mind that multiple interchanges can be batched in the same file, so one file name may correspond to more than one ISA.
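
As a rough illustration of the tracking point above, the sketch below records every (file name, control number) pair so that a file with several batched interchanges is not collapsed into one entry. It is an assumption-laden stand-in: it does not use X12Reader or SharpZipLib, the file names and ISA values are made up, and FindControlNumbers and BuildIsa are hypothetical helpers that rely only on the fixed 106-character layout of the ISA segment. With the actual reader, the control number would come from the parsed ISA item instead of a raw text scan.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical input: entry name plus the raw text of that entry.
// The first file contains two batched interchanges (trailers simplified).
var files = new List<(string Name, string Content)>
{
    ("batch1.edi", BuildIsa("000000001") + "IEA*0*000000001~"
                 + BuildIsa("000000002") + "IEA*0*000000002~"),
    ("single.edi", BuildIsa("000000003") + "IEA*0*000000003~"),
};

// Track each (file name, control number) pair rather than one or the other.
var processed = new List<(string FileName, string ControlNumber)>();
foreach (var file in files)
{
    foreach (string controlNumber in FindControlNumbers(file.Content))
    {
        processed.Add((file.Name, controlNumber));
    }
}

foreach (var item in processed)
{
    Console.WriteLine($"{item.FileName}: {item.ControlNumber}");
}

// Hypothetical helper: scans raw text for ISA segments and returns each ISA13 value.
static List<string> FindControlNumbers(string content)
{
    const int IsaLength = 106; // fixed ISA segment length, terminator included
    var result = new List<string>();
    int index = content.IndexOf("ISA", StringComparison.Ordinal);
    while (index >= 0 && index + IsaLength <= content.Length)
    {
        // The element separator is the character right after "ISA".
        char separator = content[index + 3];
        string[] fields = content.Substring(index, IsaLength).Split(separator);

        // "ISA" plus 16 elements gives 17 fields; ISA13 sits at index 13.
        if (fields.Length == 17 && fields[13].Length == 9)
        {
            result.Add(fields[13]);
            index += IsaLength;
        }
        else
        {
            index += 3; // not a real ISA header, keep scanning
        }
        index = content.IndexOf("ISA", index, StringComparison.Ordinal);
    }
    return result;
}

// Hypothetical helper: builds a fixed-width sample ISA segment with made-up values.
static string BuildIsa(string controlNumber)
{
    string[] elements =
    {
        "ISA", "00", "".PadRight(10), "00", "".PadRight(10),
        "ZZ", "SENDER".PadRight(15), "ZZ", "RECEIVER".PadRight(15),
        "210101", "1200", "U", "00401", controlNumber, "0", "P", ">"
    };
    return string.Join("*", elements) + "~";
}
```

Keeping the pairs in a flat list makes it easy to look up by either file name or control number later; a dictionary keyed by file name with a list of control numbers would work equally well.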
