
Friday, March 23, 2012

Multiple flat files to multiple tables

Hi,

I have searched but haven't yet found the best way to approach this.

I have an application that outputs data to several text files (up to 30). These share a common object name, but contain completely different column data.

In DTS I had each of the source text file connections going to one OLE DB connection and then individual transform data tasks pointing to the one OLE DB connection.

Looking at SSIS, it would appear that I would need one source and one destination for each of these, and therefore 30 parallel data flows?

Just wondering if there is a neater way of doing this?

It is a regular data import that happens a few times a day. The text files are named the same as the SQL tables, i.e. app_userdata.txt goes to the app_userdata table.

Hope that explains ok and thanks in advance.

Mike

Since each source file and each destination table has its own schema, I don't know of any better way than what you describe.

Personally I would rather see 30 packages, each with one data flow, rather than one package with 30 data flows, but that's largely a personal preference, not a hard and fast rule or anything...

|||

Thanks for the response - I hadn't considered a package for each, so I will take a look at that.

|||

The advantage that I see in the "one package per destination table" approach is that it makes ongoing maintenance much simpler. For smaller projects it seems like overkill, but it serves me well...

Multiple flat file import to corresponding SQL server tables

We are trying to use the SSIS Import/Export wizard to import the flat files (CSV format) that we have into MS SQL Server 2005 database tables. We have a huge number of CSV files. Is there a way we can import these flat (CSV) files into the corresponding SQL Server tables in a single shot? I would really appreciate the help, as it is painful to convert each and every file using the Import/Export wizard.

Thank you very much,

Regards,

Srinivas Raveendra

I am also facing the same problem.

|||

The simplest way I can see is to create a package for every file/table destination pair. Each package could have a ForEach loop to process all files that have the same structure.

|||

The way we have accomplished this task is to have a table that holds the names of all the files and the names of the tables each file is loaded into.

The SSIS package reads all the rows from this table and runs a Foreach Loop container over the records. For each iteration of the loop, the process first truncates the staging table and then bulk inserts based on the variable values. The syntax for the BULK INSERT would look something like this:

EXEC('BULK INSERT ' + @strTable + ' FROM ''' + @strFileDir + @strFile + '.csv''')

If the files you are importing are not exactly the same format as the tables you are importing them into (i.e. the columns are in a different order, or there are more or fewer columns in the file than in the table), you will need to have format files set up. The syntax with a format file would look something like this:

EXEC('BULK INSERT ' + @strTable + ' FROM ''' + @strFileDir + @strFile + '.csv'' WITH (FORMATFILE=''' + @strFormatDir + @strFormat + '.fmt'')')
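For reference, a non-XML format file for a simple three-column CSV might look something like the sketch below. The column names, lengths, and collations here are made-up examples, not from the original post; each row lists the field's position in the file, its host data type, prefix length, maximum data length, terminator, the destination column's position, its name, and its collation:

```
9.0
3
1   SQLCHAR   0   12   ","      1   UserId     ""
2   SQLCHAR   0   25   ","      2   UserName   SQL_Latin1_General_CP1_CI_AS
3   SQLCHAR   0   50   "\r\n"   3   Email      SQL_Latin1_General_CP1_CI_AS
```

The `9.0` on the first line is the format file version matching SQL Server 2005, and the `3` is the number of fields in the file.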

As a note, we have the variables @strFileDir and @strFormatDir set at the package level and dynamically controlled through a package configuration.

For more information on Bulk Insert see Books Online: http://msdn2.microsoft.com/en-us/library/ms188365.aspx
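To make the approach concrete, here is a minimal sketch of what the driving table and the per-row work might look like in T-SQL. The table and column names are assumptions for illustration, and the cursor here only stands in for the Foreach Loop container's iteration; in the actual package the looping is done by the container, not by a cursor:

```sql
-- Hypothetical control table: one row per file/table pair.
CREATE TABLE dbo.ImportFileMap (
    strFile  varchar(128) NOT NULL,  -- file name without extension, e.g. 'app_userdata'
    strTable varchar(128) NOT NULL   -- destination staging table, e.g. 'dbo.app_userdata'
);

-- Sketch of the work done for each row of the control table.
DECLARE @strFile varchar(128), @strTable varchar(128);
DECLARE @strFileDir varchar(256);
SET @strFileDir = 'C:\Imports\';  -- in the package this comes from a configuration

DECLARE curFiles CURSOR FOR
    SELECT strFile, strTable FROM dbo.ImportFileMap;
OPEN curFiles;
FETCH NEXT FROM curFiles INTO @strFile, @strTable;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- Empty the staging table, then load the matching CSV into it.
    EXEC('TRUNCATE TABLE ' + @strTable);
    EXEC('BULK INSERT ' + @strTable +
         ' FROM ''' + @strFileDir + @strFile + '.csv''' +
         ' WITH (FIELDTERMINATOR = '','', ROWTERMINATOR = ''\n'')');
    FETCH NEXT FROM curFiles INTO @strFile, @strTable;
END
CLOSE curFiles;
DEALLOCATE curFiles;
```

Because the file and table names come from data rather than from the package design, adding a 31st file is just one more row in the control table.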

Hopefully this helps.


Friday, March 9, 2012

multiple data flows

Easy: read a SQL table with 500 fields, transform, write to a flat file using SSIS.

But I have hundreds of transformations to define using Lookup, Aggregate, and Derived Column transformations. I want to group the data flow transformations into usable, reasonably sized groups (packages, containers, subroutines, whatever you want to call them).

I cannot figure out a simple easy way of doing this most "simple" obvious thing.

Am I the only one on the planet who needs to do this?

Thx.

Newbie.

Hi Paul,

No, you are not the only one who has seen a need to modularize the data flows. The current product is version 1.0, and unfortunately it does not give you an intuitive way to do this.

Currently, you may be able to achieve some level of modularization by breaking your data flows apart and connecting them through staging objects (raw files, flat files, temporary tables, etc.).

We are actively looking into improving this experience, though.

It will be there in one of the future releases.

Thanks,

Bob

|||

I think your question is how to break up what would ordinarily be a very large data flow (hundreds of transformations?!) into multiple smaller data flows.

If so, you would use Raw Files to save the data pipeline to the disk so one data flow can end and another can begin. The Raw File destination will write to a file on the disk at the end of one data flow and the Raw File source will read from it to begin the next data flow. The raw file contains the pipeline metadata, so you don't have to worry about defining your columns (500?!) each time.
|||

Okay, this makes sense, and I was looking to do exactly what the two responses above suggest. I just thought this should be a LOT easier than this... and it sounds like it will be some day...

In case anyone is listening, what I would really like is to highlight a set of (connected) data flows, right-click, select Save As Sub Data Flow... and then give it a name... That's all. I understand that the mechanics behind the scenes make this not so easy - but, whatever, it should be possible to do this. MS: please make it so.

Thanks !
|||

Thanks!

|||

Paul Des wrote:

Okay, this makes sense, and I was looking to do exactly what the two responses above suggest. I just thought this should be a LOT easier than this... and it sounds like it will be some day...

In case anyone is listening, what I would really like is to highlight a set of (connected) data flows, right-click, select Save As Sub Data Flow... and then give it a name... That's all. I understand that the mechanics behind the scenes make this not so easy - but, whatever, it should be possible to do this. MS: please make it so.

Thanks !

Paul,

I don't think I'm giving too much away by saying this will appear in the product one day. I'm hoping beyond hope that it will be in Katmai.

Note to MSFT: whatever it takes to get this into Katmai - do it!!!! :)

-Jamie

|||

Paul - thanks for the feedback. We have been working on this as a key enhancement to the product, and the experience is somewhat similar to what you describe above. As Jamie knows, not everything we're working on is for the next release, but it will not be very long before you have this available.