Bug in SSIS 2005 - or is it a feature?

Discussion in 'ASP .Net' started by John Kotuby, Mar 16, 2009.

  1. John Kotuby

    John Kotuby Guest

    Hi all,

    I was trying to accomplish what I thought would be a simple task as part of a
    much larger data migration: import a comma-delimited CSV file into a SQL
    table. As it turns out, if there is a bad-data row in the source CSV, the
    data-flow task does not simply pass that row via the "error" workflow arrow
    into another Row Count task meant to keep track of the error count, and then
    move on to the next record (that was easy to configure in DTS 2000).

    Instead, the whole Data Flow task fails. It appears that I don't even get
    a chance to intercede in the event of a bad source record. I read a few
    posts about older packages failing after .NET 3.5 SP1 is installed.

    Apparently the PrimeOutput() method of the data-flow task simply fails the
    task upon encountering a malformed record.

    I tried redirecting failed rows to a Row Count transform, which writes the
    row count to an Int32 variable.
    The red line from the mdc_msc.csv source is even labelled "Flat File Source
    Error Output".

    However, the package is still halted with a fatal error.
    It is clear to me, after looking at the CSV file in a text editor, that on
    row 1032 (of 155,000) there is a missing comma delimiter. Of course, I could
    fix that line only to discover another 20 or so bad records.

    This package must be run daily to update a working database. It must run
    unattended and merely keep track of failed rows.

    I get the "input" files from a 3rd party so cannot fix whatever might be
    causing the error in the CSV file used for input.
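
    Since the files arrive from a third party and the package must run
    unattended, one option is to pre-filter the file before SSIS ever reads it.
    A minimal sketch in Python; the expected field count and the idea of
    splitting into a clean file plus a reject list are assumptions for
    illustration, not part of the original package:

```python
import csv

def split_rows(lines, expected_fields):
    """Split raw CSV lines into (good, bad) lists by field count.

    Each entry is a (line_number, raw_line) tuple so rejected rows
    can be logged for the daily failed-row count.
    """
    good, bad = [], []
    for lineno, line in enumerate(lines, start=1):
        try:
            fields = next(csv.reader([line]))
        except (csv.Error, StopIteration):
            # Unparseable or empty line: treat as a reject.
            bad.append((lineno, line))
            continue
        (good if len(fields) == expected_fields else bad).append((lineno, line))
    return good, bad
```

    The good rows can be written back out to a clean file for the Data Flow
    task to consume, while the bad list supplies the error count the package
    needs to track.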

    Below are the error messages as I copied them:

    Error: The column delimiter for column "Column 13" was not found.
    Error: An error occurred while processing file "mdc_msc.csv" on data row
    Error: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED. The PrimeOutput method on
    component "Source - mdc_msc_csv" (1) returned error code 0xC0202092. The
    component returned a failure code when the pipeline engine called
    PrimeOutput(). The meaning of the failure code is defined by the component,
    but the error is fatal and the pipeline stopped executing. There may be
    error messages posted before this with more information about the failure.
    Error: SSIS Error Code DTS_E_THREADFAILED. Thread "SourceThread0" has exited
    with error code 0xC0047038.

    Thanks for any comment. I wish I could present this to a Microsoft SSIS
    John Kotuby, Mar 16, 2009

  2. Alexey Smirnov, Mar 16, 2009

  3. John Kotuby

    John Kotuby Guest

    An expert said to direct the error output to a Row Count task. Unfortunately,
    that does not work. The failure to "prep" the input source row apparently
    causes the Data Flow task to abend. This really does appear to be a bug if
    there is no way to capture the error at the source in a Data Flow task.

    Just wondered if anybody in this group has run across the same problem. I am
    putting this out there so that if someone does have a similar problem, they
    will know it's probably not due to a coding error on their part.

    A work-around is to run a Bulk Insert task (rather than a Data Flow task)
    with the help of a format file. However, that method does not produce any
    error file.
    John Kotuby, Mar 16, 2009
  4. John Kotuby

    John Kotuby Guest

    Thank you Alexey,
    I will try to post in the proper groups next time.

    I have scanned the forum posting, and what I get out of it is that I may have
    to run a Script task to examine each row for errors (meaning I need to
    anticipate what may be wrong with the source data -- a difficult, maybe
    impossible, task) and then route the rows to a certain pipeline depending on
    the results of the Script task.

    In the data I was using, there appeared to be 1 bad row out of 150,000. The
    bad data was caused by too many quotes surrounding a character field. That
    apparently threw off the field synchronization between the source and the
    meta-data table created by the package. In my opinion, that occurrence should
    spit the row out through the error path (the little red arrow) to whatever
    garbage collector I wish to use, and not cause the task to fail.
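
    The "too many quotes" failure described above is exactly the kind of check a
    per-row Script task would perform. A sketch of that check (the expected
    field count is an assumption), using Python's csv module in strict mode so
    malformed quoting raises an error instead of being silently repaired:

```python
import csv

def row_is_well_formed(line, expected_fields):
    """Return True if the line parses under strict CSV quoting rules
    and yields exactly the expected number of fields."""
    try:
        fields = next(csv.reader([line], strict=True))
    except (csv.Error, StopIteration):
        # Stray characters after a closing quote, or an empty line.
        return False
    return len(fields) == expected_fields
```

    Rows failing this check would be routed to the "garbage collector" path;
    the rest continue to the normal load.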

    I admit that there is probably a lot I don't understand about SSIS 2005;
    however, I think my take on the situation is just good common sense.
    Let's just say that if I were coding a tool like this, a malformed row would
    be spit out and recorded...simple as that.

    I have seen numerous posts about slight variations in SQL Server service
    packs, or even just patches, that will cause an SSIS package to run on one
    machine and fail on another. Whatever happened to backward compatibility
    within the same product generation?

    Although it is the wrong group, did you already see this?
    John Kotuby, Mar 16, 2009
  5. Paul Shapiro

    Paul Shapiro Guest

    I believe there is a data transfer task property to set the allowed number
    of failures. It defaults to 0, so any error cancels the task, but I seem to
    remember you could increase that limit. Then you'll get the error rows in
    the error output, and the good rows should be processed.

    You could also import the data into a pre-qualifying table with a single
    varchar field, if you can check for erroneous rows with SQL.
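
    Paul's pre-qualifying-table idea can be sketched as follows. SQLite stands
    in for SQL Server here, and the three-field layout and comma-count rule are
    illustrative assumptions (a raw comma count ignores quoted commas, so this
    is only a first-pass filter):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging (raw_line TEXT)")

# Hypothetical raw lines; in practice these come from the daily CSV.
lines = ["a,b,c", "bad line", "d,e,f"]
conn.executemany("INSERT INTO staging (raw_line) VALUES (?)",
                 [(l,) for l in lines])

# Flag rows that do not contain exactly 2 commas (3 fields expected).
bad_rows = conn.execute(
    "SELECT raw_line FROM staging"
    " WHERE length(raw_line) - length(replace(raw_line, ',', '')) <> 2"
).fetchall()
```

    Rows that pass the check can then be re-parsed into the real table, while
    the flagged rows feed the error count, all without the pipeline ever seeing
    a malformed record.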
    Paul Shapiro, Mar 17, 2009
  6. Paul Shapiro

    Paul Shapiro Guest

    Paul Shapiro, Mar 17, 2009
  7. Alexey Smirnov, Mar 17, 2009
